Abstract

We propose a conservative extension of the polymorphic lambda
calculus ($F^\omega$) as an intermediate language for compiling
languages with name-based class and interface hierarchies. Our
extension enriches standard $F^\omega$ with recursive types,
existential types, and row polymorphism, but only ordered records
with no subtyping. Basing our language on $F^\omega$ also makes it a
suitable target for translation from other higher-order languages;
this enables safe interoperation between class-based and higher-order
languages and the reuse of common type-directed optimization
techniques, compiler back ends, and runtime support.

We present the formal semantics of our intermediate language and
illustrate its features by providing a formal translation from a
subset of Java, including classes, interfaces, and private instance
variables. The translation preserves the name-based hierarchical
relation between Java classes and interfaces, and allows access to
private instance variables of parameters of the same class as the
one defining the method. It also exposes the details of method
invocation and instance variable access and allows many standard
optimizations to be performed on the object-oriented code.

1  Introduction 

The explosive growth of the World Wide Web has, as the Java
phenomenon demonstrates, induced newfound interest in mobile
computation for “programming the Web.” In this domain, the safety
and security properties of programs are more crucial than ever
before. Recent work demonstrates a clear connection between security
properties and formal type systems [28, 20, 24].

Type checking has long been used to ensure certain properties about
the runtime behavior of programs written in strongly typed languages.
In the conventional model, however, type information is discarded
immediately after type checking. We must therefore trust that the
compiler, through its many transformations and optimizations,
faithfully preserves the source language semantics. Furthermore,
given only the object code (without type information), it may be
difficult or impossible to verify such properties. Thus, it is
becoming increasingly important to preserve full type information
throughout the compilation process.

The Java Virtual Machine Language (JVML) [25] was designed to address
these issues. JVML byte code contains sufficient type information for
an automatic verifier to prove memory safety and other properties.
The translation from Java to JVML is mostly type-preserving.

While JVML does contain type information and submits to
verification, it has several drawbacks. First, JVML may not be a
good fit for source languages other than Java [27, 3]. For example,
JVML does not provide direct support for tail recursion, higher-order
functions, or polymorphic functions, making it impractical for
implementing functional languages. Second, JVML is designed for
bytecode interpretation, not as a compiler intermediate language.
The JVML instruction set is based on a stack machine, so it is
difficult to perform standard optimizations on JVML. Third, JVML has
a complex semantics, so it does not provide good support for formal
reasoning. Much cutting-edge research on security and information
flow [28, 20, 24] is hard to incorporate into the JVML-based
framework.

The FLINT project at Yale [34, 35] takes a different approach. We aim
to build a compiler infrastructure for HOT (higher-order and typed)
languages. Our goal is to use a richly typed intermediate language
(based on the polymorphic λ-calculus $F^\omega$ [18, 33]) as a common
target for compiling various source languages (e.g., Java, ML). Our
system would allow the reuse of common type-directed optimization
techniques, compiler back ends, and runtime support.

Building a production-quality type-preserving compiler is by no means
trivial, especially in the presence of advanced language features
such as parametric polymorphism and higher-order modules. Recent
advances in compiler technology, however, are making type-preserving
compilation a reality [37, 36]; the current version of FLINT has been
in widespread use as the intermediate language of the SML/NJ
compiler [2] since January 1997.

Extending our typed intermediate language to handle Java classes
poses many new challenges. First, the Java type system is
dramatically different from the $F^\omega$ type system. Naively
combining Java and FLINT could easily lead to an incomprehensible
language. The challenge is to abstract the commonality and to find a
synergy between the two languages.

Another challenge in modeling Java classes is to find an object
encoding that is faithful to the Java semantics yet still supports
efficient implementation. Existing encodings of object-oriented
features in typed λ-calculi [21, 5] use records of functions as
dictionaries and existential types for dynamic binding. These
encodings, however, do not support Java-like name-based class
hierarchies; two different class types with exactly the same set of
methods and fields are considered equivalent by these encodings.
This is not acceptable because class and interface names in Java
play a critical role during type-checking, linking, loading, runtime
execution, and casting. The explicit class hierarchy is also crucial
for many important optimizations on object-oriented languages [10].

Furthermore, because existing object encodings are intended as
theoretical models rather than as intermediate representations, they
often use non-trivial coercions in order to simplify the type system.
For example, Pierce and Turner [29] model inheritance using a pair of
functions for coercing between the superclass and subclass views of
an object. For the purposes of compilation, the cost of calling an
unknown function to obtain a different view of an object is
unacceptable. For Java in particular, these coercions would typically
be identity functions. In the presence of separate compilation, we
could not expect optimizations to be able to replace these general
coercions with more efficient operations.

Finally, the implicit subtyping typically used in these encodings can
make them rather complex, in some cases rendering type checking
undecidable. We want a more conservative extension, less powerful
than, say, the system of [9], but able to support fast type-checking,
an explicit class hierarchy, and efficient implementation.

Our main contribution is a formal translation of Java classes,
interfaces, and privacy into a variant of $F^\omega$ using simple and
well-known extensions. Although we use several familiar techniques,
their combination to support the efficient compilation of an
interesting subset of Java is novel.

We  use  row  polymorphism  [31]  to  implement  Java  inheritance 
and  to  allow  objects  of  subclasses  to  masquerade  as  superclasses. 
We  use  existential  types  with  dot  notation  [8]  to  model  Java’s 
named  types  and  to  preserve  the  class  and  interface  hierarchy.  In 
addition,  we  informally  describe  how  to  extend  our  framework  to 
support  other  interesting  features  of  Java  such  as  dynamic  casts, 
checked  exceptions,  and  mutually  recursive  declarations. 

Although our intermediate language contains full type information,
the implementations of object creation, method invocation, and field
selection are all quite efficient. This is because the coercions used
to implement inheritance and superclass subsumption all reduce to
no-ops under a type-erasure dynamic semantics. Although our approach
to implementing interfaces appears to be unconventional (casting an
object to interface type requires a simple coercion), it allows
interface method invocations to have the same cost as ordinary method
invocations. From our experience with implementing functional
languages, we believe the cost of the coercion can be paid for by the
fast method accesses it enables. Since all these operations are
implemented in terms of standard type application, record selection,
and function application, they are subject to standard optimizations.
Furthermore, the interaction of these features with other (non-Java)
$F^\omega$ code can be well understood.

The remainder of this paper formulates the source and target
languages (Sections 2 and 3), illustrates the object layout
(Section 4), explains the translation algorithm (Section 5),
discusses the possible extensions (Section 6), and then closes with
related work (Section 7) and conclusions (Section 8).

2  The  Source  Language 

The source language Javacito is a small calculus representing a
restricted subset of Java. It contains the essence of Java classes,
interfaces, and access control, with several simplifications that
permit a concise formal semantics and a comprehensible translation.

The program in Figure 1 contains three classes and one interface that
demonstrate some interesting features of Javacito. Class SPt
overrides method move, and inherits methods bump and max. The keyword
super is used to invoke a statically bound method from the superclass
(cf. Figure 1, line 10). Methods are selected


 1  class Pt {
 2    private int x = 0;
 3    public Pt max (Pt other) { (this.x > other.x)? this : other }
 4    public void move (int dx) { this.x = this.x + dx; }
 5    public void bump () { this.move (1); }
 6  }
 7  interface Zm { public void zoom (int s); }
 8  class SPt ext Pt imp Zm {
 9    private int scale = 1;
10    public void move (int dx) { super.move (this.scale * dx); }
11    public void zoom (int s) { this.scale = this.scale * s; }
12  }
13  class Main {
14    private Pt p = new Pt;
15    private SPt sp = new SPt;
16    public void zoom2 (Zm z) { z.zoom (2); }
17    public void main () {
18      this.p.bump ();
19      this.zoom2 ((Zm) this.sp);
20      this.sp.bump ();
21      this.zoom2 ((Zm) (SPt)                // cf. Section 6
22                  this.p.max ((Pt) this.sp)); }
23  }
24  (new Main).main();

Figure 1: Sample Javacito program.


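\[
\begin{array}{lcl}
p     & ::= & decl^{*}\ exp \\
decl  & ::= & \texttt{interface}\ i\ \texttt{ext}\ i^{*}\ \{ (msig;)^{*} \} \\
      & \mid & \texttt{class}\ c\ \texttt{ext}\ c'\ \texttt{imp}\ i^{*}\ \{ field^{*}\ meth^{*} \} \\
field & ::= & \texttt{private}\ ty\ f = exp; \\
msig  & ::= & \texttt{public}\ ty\ m\ ((ty\ x)^{*}) \\
meth  & ::= & msig\ exp \\
ty    & ::= & i \mid c \\
exp   & ::= & val \\
      & \mid & \{ exp\ (; exp)^{*} \} \\
      & \mid & \texttt{new}\ c \\
      & \mid & (ty)\ exp \\
      & \mid & exp.m\ (exp^{*}) \\
      & \mid & \underline{c \cdot val \triangleright}\ \texttt{super}.m\ (exp^{*}) \\
      & \mid & \underline{c \triangleright}\ exp.f \\
      & \mid & \underline{c \triangleright}\ exp.f = exp \\
val   & ::= & x
\end{array}
\]

\[
\begin{array}{ll}
i \in \mathit{InterfaceNames} & f \in \mathit{FieldNames} \\
c \in \mathit{ClassNames} \cup \{\texttt{Object}\} & m \in \mathit{MethNames} \\
x \in \mathit{VarNames} \cup \{\texttt{this}\} &
\end{array}
\]

Figure 2: Syntax of Javacito. Underlines indicate annotations
required by the contextual operational semantics.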


based  on  the  dynamic  class  of  the  receiver.  When  the  bump  method 
is  invoked  on  an  object  of  class  SPt,  it  is  delegated  to  the  bump 
method  in  class  Pt  (line  5),  which  then  invokes  the  move  method  in 
class  SPt  (line  10). 
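The dispatch behavior just described can be sketched in ordinary Java. This is only an illustration under stated assumptions: the class names DispatchPt and DispatchSPt stand in for Pt and SPt, and the accessor getX is hypothetical, added so that the private field can be observed from outside the class.

```java
// Hypothetical plain-Java rendering of Pt and SPt from Figure 1.
// getX() is an assumed accessor that is not part of the Javacito program.
class DispatchPt {
    private int x = 0;

    public void move(int dx) { this.x = this.x + dx; }

    // The call to move is selected by the dynamic class of this.
    public void bump() { this.move(1); }

    public int getX() { return this.x; }
}

class DispatchSPt extends DispatchPt {
    private int scale = 1;

    // super.move is statically bound to the superclass implementation.
    @Override
    public void move(int dx) { super.move(this.scale * dx); }

    public void zoom(int s) { this.scale = this.scale * s; }
}

class DispatchDemo {
    // Bumps a plain point and a scaled point; returns both x coordinates.
    static int[] run() {
        DispatchPt p = new DispatchPt();
        DispatchSPt sp = new DispatchSPt();
        sp.zoom(2);
        p.bump();   // Pt.bump -> Pt.move(1)
        sp.bump();  // Pt.bump -> SPt.move(1) -> Pt.move(2)
        return new int[] { p.getX(), sp.getX() };
    }

    public static void main(String[] args) {
        int[] xs = run();
        System.out.println(xs[0] + " " + xs[1]);
    }
}
```

The inherited bump contains no knowledge of SPt, yet it reaches the overriding move because the receiver's dynamic class is SPt.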

Classes have private mutable fields and public methods. A class name
may appear recursively in the types of its fields and methods
(line 3). A method in class Pt can access the private fields of
arguments which statically have the same class type. Objects can be
passed to methods expecting arguments of a superclass (line 22) or
interface type (line 19).

We also support several features not demonstrated by this sample
program. An interface may extend several other interfaces and a class
may implement several interfaces. A class may contain a field which
is an instance of the class itself (discussed further in
Section 5.1.1), and new instances of a class may be created inside
its own methods and field initializers.

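Java features we do not support in Javacito include null references,
public fields, private methods, static members, protected, package
scope, final, constructors, finalizers, mutually recursive classes,
exceptions, reflection, and concurrency. As discussed informally in
Section 6, many of these features are simple extensions of the
framework. Although a dynamic cast appears in the example (line 21),
we also treat this feature as an extension.

Environment syntax:

\[
\begin{array}{llcl}
\text{Type signature} & T & ::= & ty_1 \ldots ty_n \to ty \\
\text{Hierarchy} & \mathcal{H} & ::= & \emptyset \mid \mathcal{H} \uplus \{ i \mapsto [i^{*}] \} \mid \mathcal{H} \uplus \{ c \mapsto (c', [i^{*}]) \} \\
\text{Member type environment} & \mathcal{E} & ::= & \emptyset \mid \mathcal{E} \uplus \{ ty \mapsto ([(f, ty)^{*}], [(m, T)^{*}]) \} \\
\text{Member code environment} & \mathcal{R} & ::= & \emptyset \mid \mathcal{R} \uplus \{ c \mapsto (\{(f, exp)^{*}\}, \{(m, (x^{*})\ exp)^{*}\}) \}
\end{array}
\]

Hierarchy formation ($\vdash_H$):

\[
\frac{\vdash_H \mathcal{H} \qquad i \notin \mathrm{dom}(\mathcal{H}) \qquad
      \forall j \in \{1..n\}.\ (i_j = i_{j'} \Rightarrow j = j' \text{ and } \mathcal{H} \vdash_t i_j)}
     {\vdash_H \mathcal{H} \uplus \{ i \mapsto [i_1, \ldots, i_n] \}}
\]

\[
\frac{\vdash_H \mathcal{H} \qquad c \notin \mathrm{dom}(\mathcal{H}) \qquad \mathcal{H} \vdash_t c' \qquad
      \forall j \in \{1..n\}.\ (i_j = i_{j'} \Rightarrow j = j' \text{ and } \mathcal{H} \vdash_t i_j)}
     {\vdash_H \mathcal{H} \uplus \{ c \mapsto (c', [i_1, \ldots, i_n]) \}}
\]

Member type environment formation ($\vdash_E$):

\[
\frac{\begin{array}{ll}
      \mathcal{H} \vdash_E \mathcal{E} & ty \in \mathrm{dom}(\mathcal{H}) \\
      \forall j \in \{1..n\}.\ \mathcal{H} \vdash_t ty_j & \forall j \in \{1..n\}.\ f_j = f_{j'} \Rightarrow j = j' \\
      \forall k \in \{1..p\}.\ \mathcal{H} \vdash_t T_k & \forall k \in \{1..p\}.\ m_k = m_{k'} \Rightarrow k = k' \\
      \multicolumn{2}{l}{\forall k \in \{1..p\}.\ \forall ty'.\ (ty \leq_{\mathcal{H}} ty' \text{ and } (m_k, T') \in_{\mathcal{H}\mathcal{E}} ty') \Rightarrow T_k = T'}
      \end{array}}
     {\mathcal{H} \vdash_E \mathcal{E} \uplus \{ ty \mapsto ([(f_j, ty_j)^{j \in [1..n]}], [(m_k, T_k)^{k \in [1..p]}]) \}}
\]

Figure 3: Environment formation.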

We use integers, arithmetic operations, conditional expressions, and
void in the example without defining them formally. We do not
distinguish between statements and expressions; rather, we use an
ML-like sequence expression with syntax { exp (; exp)* }. The last
expression in this sequence is implicitly returned.

Finally, we assume several transformations have been made to the Java
code in a pre-processing phase. For instance, all implicit references
to this are made explicit. Overloaded method references are resolved
statically by including argument types as part of the method name.
Instance variable shadowing is not an issue in this restricted
language because all fields are private, so super is only needed for
invoking methods in the superclass. Implicit subsumption is made
explicit by inserting sequences of upward casts. Similar
transformations are defined formally in several papers by
Drossopoulou and Eisenbach [11, 12].
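Two of these pre-processing steps can be sketched as follows. This is a minimal illustration, not the pre-processor assumed by the paper: the class CastInsertion, the table IMMEDIATE, and the "$"-separated mangling scheme are all hypothetical. The sketch finds a chain of immediate supertypes by breadth-first search and wraps an expression in one upward cast per step.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class CastInsertion {
    // Immediate supertypes of each declared type (hypothetical table
    // for the types of Figure 1; Object has no entry).
    static final Map<String, List<String>> IMMEDIATE = new HashMap<>();
    static {
        IMMEDIATE.put("Pt", Arrays.asList("Object"));
        IMMEDIATE.put("SPt", Arrays.asList("Pt", "Zm"));
        IMMEDIATE.put("Zm", Collections.<String>emptyList());
    }

    // Returns the types to cast through, nearest first, or null when
    // 'to' is not reachable from 'from' via immediate supertypes.
    static List<String> chain(String from, String to) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String cur = queue.remove();
            if (cur.equals(to)) {
                LinkedList<String> path = new LinkedList<>();
                for (String t = cur; !t.equals(from); t = parent.get(t)) {
                    path.addFirst(t);
                }
                return path;
            }
            List<String> sups = IMMEDIATE.get(cur);
            if (sups == null) continue;
            for (String sup : sups) {
                if (!parent.containsKey(sup)) {
                    parent.put(sup, cur);
                    queue.add(sup);
                }
            }
        }
        return null;
    }

    // Makes subsumption explicit: one cast per immediate-supertype step.
    static String upcast(String exp, String from, String to) {
        List<String> path = chain(from, to);
        if (path == null) {
            throw new IllegalArgumentException(to + " is not a supertype of " + from);
        }
        String result = exp;
        for (String t : path) result = "(" + t + ") " + result;
        return result;
    }

    // Resolves overloading by folding argument types into the method name.
    static String mangle(String method, List<String> argTypes) {
        return method + "$" + String.join("$", argTypes);
    }
}
```

For instance, upcast("this.sp", "SPt", "Object") passes through Pt before reaching Object, so each inserted cast targets an immediate supertype.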

The syntax of Javacito is given in Figure 2. The underlined terms in
Figure 2 are annotations required by our operational semantics. In
all terms, the prefix $c \triangleright$ indicates that the term is
statically enclosed in the declaration of class c. In the super term,
the additional val in the annotation holds the value of this. These
annotations are required by the contextual operational semantics, as
explained below.

2.1  Static  semantics 

Figures 3 and 4 contain the syntax of static environments, rules for
environment formation, and definitions of environment-summarizing
relations. A hierarchy $\mathcal{H}$ is an environment which maps
class or interface names to their immediate superclasses and

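Hierarchy relations:

\[
\begin{array}{lcll}
c \prec^{c}_{\mathcal{H}} c' & \Leftrightarrow & c' = \pi_1(\mathcal{H}(c)) & \text{Class is declared as an immediate subclass} \\
i \prec^{i}_{\mathcal{H}} i' & \Leftrightarrow & i' \in \mathcal{H}(i) & \text{Interface is declared as an immediate subinterface} \\
c \prec\!\!\prec^{c}_{\mathcal{H}} i & \Leftrightarrow & i \in \pi_2(\mathcal{H}(c)) & \text{Class declares implementation of an interface}
\end{array}
\]

Derived hierarchy relations:

\[
\begin{array}{lcll}
\prec_{\mathcal{H}} & = & \prec^{c}_{\mathcal{H}} \cup \prec^{i}_{\mathcal{H}} \cup \prec\!\!\prec^{c}_{\mathcal{H}} & \text{Type is an immediate subtype} \\
\leq^{c}_{\mathcal{H}} & = & \text{reflexive transitive closure of } \prec^{c}_{\mathcal{H}} & \text{Class is a subclass} \\
\leq^{i}_{\mathcal{H}} & = & \text{reflexive transitive closure of } \prec^{i}_{\mathcal{H}} & \text{Interface is a subinterface} \\
c \ll^{c}_{\mathcal{H}} i & \Leftrightarrow & \exists c', i' \text{ s.t. } c \leq^{c}_{\mathcal{H}} c' \text{ and } c' \prec\!\!\prec^{c}_{\mathcal{H}} i' \text{ and } i' \leq^{i}_{\mathcal{H}} i & \text{Class implements an interface} \\
\leq_{\mathcal{H}} & = & \leq^{c}_{\mathcal{H}} \cup \leq^{i}_{\mathcal{H}} \cup \ll^{c}_{\mathcal{H}} & \text{Type is a subtype}
\end{array}
\]

Membership relations:

\[
\begin{array}{lcll}
(f, X) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c & \Leftrightarrow & (f, X) \in \pi_1(\mathcal{E}(c)) & \text{Field is declared in class} \\
(m, X) \mathrel{\in\!\!\!\in}_{\mathcal{E}} ty & \Leftrightarrow & (m, X) \in \pi_2(\mathcal{E}(ty)) & \text{Method is declared in type}
\end{array}
\]

Derived membership relations:

\[
\begin{array}{lcll}
(f, X) \in^{c}_{\mathcal{H}\mathcal{E}} c & \Leftrightarrow & (f, X) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c' \text{ and } c \leq^{c}_{\mathcal{H}} c' & \text{Field is contained in class} \\
(m, X) \in^{c}_{\mathcal{H}\mathcal{E}} c & \Leftrightarrow & (m, X) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c' \text{ and } c' = \min\{ c'' \mid c \leq^{c}_{\mathcal{H}} c'' \text{ and } (m, \_) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c'' \} & \text{Method is contained in class} \\
(m, X) \in^{i}_{\mathcal{H}\mathcal{E}} i & \Leftrightarrow & (m, X) \mathrel{\in\!\!\!\in}_{\mathcal{E}} i' \text{ and } i \leq^{i}_{\mathcal{H}} i' & \text{Method is contained in interface} \\
\in_{\mathcal{H}\mathcal{E}} & = & \in^{c}_{\mathcal{H}\mathcal{E}} \cup \in^{i}_{\mathcal{H}\mathcal{E}} & \text{Member is contained in type}
\end{array}
\]

Figure 4: Environment relations.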


superinterfaces. The hierarchy formation judgment $\vdash_H$
guarantees that a new type is added to the hierarchy at most once,
and that all supertypes are already defined and mentioned only once.
The member type environment $\mathcal{E}$ maps types to lists of
field and method names, paired with their respective types. The
formation rule ensures that the names are unique, the types are
valid, and that overriding a method preserves its type.¹ The member
code environment $\mathcal{R}$ maps class names to a set of field
initializers and method bodies. This environment is created during
elaboration, for use at runtime.

We use a set of hierarchy and membership relations similar to those
used to describe ClassicJava [17], except that ours are defined in
terms of environments $\mathcal{H}$ and $\mathcal{E}$ rather than on
complete programs. The membership relations apply to both type and
code environments. Note that for methods contained in class c
($(m, X) \in_{\mathcal{H}\mathcal{E}} c$), the auxiliary datum X is
drawn from the nearest superclass of c that declares the method m.
When a code environment $\mathcal{R}$ is used in place of
$\mathcal{E}$, this has the effect of selecting the appropriate
method body by using the inheritance hierarchy.
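The "nearest declaring superclass" reading of method containment can be sketched as a walk up the class hierarchy. The sketch below makes several assumptions for illustration only: the class MemberLookup, the maps SUPER and DECLARED, and the use of strings as stand-ins for method bodies are all hypothetical, and interfaces are left out.

```java
import java.util.HashMap;
import java.util.Map;

class MemberLookup {
    // Hierarchy restricted to classes: class name -> immediate superclass.
    // Object is the root and has no entry.
    static final Map<String, String> SUPER = new HashMap<>();

    // Code environment: class name -> (method name -> body), listing
    // only the methods a class declares itself. Bodies are plain strings.
    static final Map<String, Map<String, String>> DECLARED = new HashMap<>();

    static {
        SUPER.put("Pt", "Object");
        SUPER.put("SPt", "Pt");

        Map<String, String> pt = new HashMap<>();
        pt.put("max", "Pt.max");
        pt.put("move", "Pt.move");
        pt.put("bump", "Pt.bump");
        DECLARED.put("Pt", pt);

        Map<String, String> spt = new HashMap<>();
        spt.put("move", "SPt.move");
        spt.put("zoom", "SPt.zoom");
        DECLARED.put("SPt", spt);
    }

    // Walks upward from c and returns the body found in the nearest
    // class that declares m, or null if no superclass declares it.
    static String lookup(String c, String m) {
        for (String cur = c; cur != null; cur = SUPER.get(cur)) {
            Map<String, String> methods = DECLARED.get(cur);
            if (methods != null && methods.containsKey(m)) {
                return methods.get(m);
            }
        }
        return null;
    }
}
```

Under these assumptions, looking up bump in SPt yields the body declared in Pt, while looking up move in SPt yields the overriding body.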

Figure 5 contains the typing judgments comprising the static
semantics of Javacito. Declarations are processed one at a time; they
may be recursive, but not mutually recursive. All declarations extend
the hierarchy $\mathcal{H}$ and the member type environment
$\mathcal{E}$. Class declarations also extend the member code
environment $\mathcal{R}$, which is used to retrieve code at runtime.
Unlike in ClassicJava, there are no rules for subsumption. Instead,
we require explicit upward casts. Casts are only legal from some type
to its immediate supertype (exp-cast). Privacy is enforced in
(exp-get) and (exp-set), the rules for field access. A field f can
only be accessed from within the class c that declares it. The object
from which f is selected may be
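¹ Rules for correctness of empty environments are omitted. Our
notation for lists and list comprehensions is the usual, but we abuse
set membership notation on lists.

Programs ($\vdash_p$):

\[
\frac{\mathcal{H}; \mathcal{E}; \mathcal{R} \vdash_d decl \Rightarrow \mathcal{H}'; \mathcal{E}'; \mathcal{R}' \qquad
      \mathcal{H}'; \mathcal{E}'; \mathcal{R}' \vdash_p decl_1 \ldots decl_n\ exp : ty \Rightarrow \mathcal{H}''; \mathcal{E}''; \mathcal{R}''}
     {\mathcal{H}; \mathcal{E}; \mathcal{R} \vdash_p decl\ decl_1 \ldots decl_n\ exp : ty \Rightarrow \mathcal{H}''; \mathcal{E}''; \mathcal{R}''}
\ \text{(prog-decl)}
\]

\[
\frac{\mathcal{H}; \mathcal{E}; \emptyset; \texttt{Object} \vdash_e exp : ty}
     {\mathcal{H}; \mathcal{E}; \mathcal{R} \vdash_p exp : ty \Rightarrow \mathcal{H}; \mathcal{E}; \mathcal{R}}
\ \text{(prog-exp)}
\]

Types ($\vdash_t$) and methods ($\vdash_m$):

\[
\frac{ty \in \mathrm{dom}(\mathcal{H}) \cup \{\texttt{Object}\}}
     {\mathcal{H} \vdash_t ty}
\ \text{(typ-ok)}
\qquad
\frac{\forall j \in \{1..n\}.\ \mathcal{H} \vdash_t ty_j \qquad \mathcal{H} \vdash_t ty}
     {\mathcal{H} \vdash_t ty_1 \ldots ty_n \to ty}
\ \text{(typ-msig)}
\]

\[
\frac{\mathcal{H}; \mathcal{E}; \{\texttt{this} : c, x_1 : ty_1, \ldots, x_n : ty_n\}; c \vdash_e exp : ty}
     {\mathcal{H}; \mathcal{E}; c \vdash_m ty\ m (ty_1\ x_1, \ldots, ty_n\ x_n)\ exp}
\ \text{(meth-body)}
\]

Declarations ($\vdash_d$):

\[
\frac{\begin{array}{l}
      \mathcal{H}' = \mathcal{H} \uplus \{ i \mapsto [i_1, \ldots, i_n] \} \qquad \vdash_H \mathcal{H}' \\
      \mathcal{E}' = \mathcal{E} \uplus \{ i \mapsto ([\,], [MT(msig_k)^{k \in [1..p]}]) \} \qquad \mathcal{H}' \vdash_E \mathcal{E}'
      \end{array}}
     {\mathcal{H}; \mathcal{E}; \mathcal{R} \vdash_d \texttt{interface}\ i\ \texttt{ext}\ i_1 \ldots i_n\ \{ msig_1; \ldots msig_p; \} \Rightarrow \mathcal{H}'; \mathcal{E}'; \mathcal{R}}
\ \text{(decl-iface)}
\]

\[
\frac{\begin{array}{l}
      \mathcal{H}' = \mathcal{H} \uplus \{ c \mapsto (c', [i_1 \ldots i_n]) \} \qquad \vdash_H \mathcal{H}' \\
      \mathcal{E}' = \mathcal{E} \uplus \{ c \mapsto ([(f_k, ty_k)^{k \in [1..m]}], [MT(msig_l)^{l \in [1..p]}]) \} \qquad \mathcal{H}' \vdash_E \mathcal{E}' \\
      \mathcal{R}' = \mathcal{R} \uplus \{ c \mapsto (\{ (f_k, exp_k) \mid k \in \{1..m\} \}, \{ MR(msig_l\ exp'_l) \mid l \in \{1..p\} \}) \} \\
      \forall j \in \{1..n\}.\ \text{if } (m', T') \in_{\mathcal{H}'\mathcal{E}'} i_j \text{ then } (m', T') \in_{\mathcal{H}'\mathcal{E}'} c \\
      \forall k \in \{1..m\}.\ \mathcal{H}'; \mathcal{E}'; \emptyset; c \vdash_e exp_k : ty_k \\
      \forall l \in \{1..p\}.\ \mathcal{H}'; \mathcal{E}'; c \vdash_m msig_l\ exp'_l
      \end{array}}
     {\begin{array}{l}
      \mathcal{H}; \mathcal{E}; \mathcal{R} \vdash_d \texttt{class}\ c\ \texttt{ext}\ c'\ \texttt{imp}\ i_1 \ldots i_n \\
      \quad \{ ty_1\ f_1 = exp_1; \ldots ty_m\ f_m = exp_m;\ msig_1\ exp'_1 \ldots msig_p\ exp'_p \} \Rightarrow \mathcal{H}'; \mathcal{E}'; \mathcal{R}'
      \end{array}}
\ \text{(decl-class)}
\]

Expressions ($\vdash_e$):

\[
\frac{x \in \mathrm{dom}(\Gamma)}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e x : \Gamma(x)}
\ \text{(exp-var)}
\qquad
\frac{\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp : ty' \qquad ty' \prec_{\mathcal{H}} ty}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e (ty)\ exp : ty}
\ \text{(exp-cast)}
\]

\[
\frac{\begin{array}{l}
      \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp : ty' \qquad (m, (ty_1 \ldots ty_n \to ty)) \in_{\mathcal{H}\mathcal{E}} ty' \\
      \forall j \in \{1..n\}.\ \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp_j : ty_j
      \end{array}}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp.m (exp_1 \ldots exp_n) : ty}
\ \text{(exp-call)}
\]

\[
\frac{\begin{array}{l}
      c \prec^{c}_{\mathcal{H}} c' \qquad (m, (ty_1 \ldots ty_n \to ty)) \in_{\mathcal{H}\mathcal{E}} c' \\
      \forall j \in \{1..n\}.\ \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp_j : ty_j
      \end{array}}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e c \cdot \texttt{this} \triangleright \texttt{super}.m (exp_1 \ldots exp_n) : ty}
\ \text{(exp-super)}
\]

\[
\frac{\mathcal{H} \vdash_t c}
     {\mathcal{H}; \_; \_; \_ \vdash_e \texttt{new}\ c : c}
\ \text{(exp-new)}
\qquad
\frac{\forall j \in \{1..n\}.\ \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp_j : ty_j}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e \{ exp_1; \ldots; exp_n \} : ty_n}
\ \text{(exp-body)}
\]

\[
\frac{\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp : ty' \qquad ty' \leq^{c}_{\mathcal{H}} c \qquad (f, ty) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e c \triangleright exp.f : ty}
\ \text{(exp-get)}
\]

\[
\frac{\begin{array}{l}
      \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp_1 : ty' \qquad ty' \leq^{c}_{\mathcal{H}} c \\
      \mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e exp_2 : ty \qquad (f, ty) \mathrel{\in\!\!\!\in}_{\mathcal{E}} c
      \end{array}}
     {\mathcal{H}; \mathcal{E}; \Gamma; c \vdash_e c \triangleright exp_1.f = exp_2 : ty}
\ \text{(exp-set)}
\]

Utility functions:

\[
\begin{array}{lcl}
MT(ty\ m (ty_1\ x_1, \ldots, ty_n\ x_n)) & = & (m, (ty_1 \ldots ty_n \to ty)) \\
MR(ty\ m (ty_1\ x_1, \ldots, ty_n\ x_n)\ exp) & = & (m, (x_1 \ldots x_n)\ exp)
\end{array}
\]

Figure 5: Static semantics of Javacito.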


a  subclass  of  c,  however. 

Proposition 1 (Decidability) $\vdash_p$ is decidable.

2.2  Operational  semantics 

Figure 6 contains the contextual small-step operational semantics of
Javacito. We introduce two new terms: object expressions
$\langle c, [(f \mapsto exp)^{*}] \rangle$ and locations loc. The
runtime store S maps each location to an object value, which consists
of a class tag and a collection F of field-value bindings.

An evaluation context E is an expression with a hole, which encodes
the search for the next redex. The form $\{E; \ldots\}$ indicates
that the first expression of a sequence is always the redex. An
argument list of the form $(val_1, \ldots, val_n, E, \ldots)$ means
that all expressions to the left of the redex must be values. The
state of the computation is characterized completely by a term paired
with a store. The primitive reduction relation $\hookrightarrow$ is
defined between these states, given the class hierarchy $\mathcal{H}$
and the code $\mathcal{R}$. We do not need error states; since
Javacito does not have null references or dynamic casts, no runtime
errors can occur. Rather, evaluation gets stuck if one of the side
conditions of a reduction is not met.

An  object  expression  is  used  to  represent  partially  initialized 
objects  during  the  reduction  of  new  c.  Once  all  the  field  expressions 
are  reduced  to  values,  we  allocate  a  new  location  in  which  to  store 
the  object  literal. 
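The store and the allocation step can be pictured with a small Java sketch. It is an illustration only, not the formal semantics: the names ObjectValue and Store are hypothetical, locations are modeled as integers, and the privacy and subclass side conditions of the field rules are not checked.

```java
import java.util.HashMap;
import java.util.Map;

// Object value: a class tag plus a map from field names to values.
final class ObjectValue {
    final String classTag;
    final Map<String, Object> fields;

    ObjectValue(String classTag, Map<String, Object> fields) {
        this.classTag = classTag;
        this.fields = fields;
    }
}

// Runtime store: a finite map from locations to object values.
final class Store {
    private final Map<Integer, ObjectValue> cells = new HashMap<>();
    private int next = 0; // a counter, so a new location is never already in use

    // Called once every field initializer has been reduced to a value:
    // picks a fresh location and stores the finished object there.
    int alloc(String classTag, Map<String, Object> fieldValues) {
        int loc = next++;
        cells.put(loc, new ObjectValue(classTag, new HashMap<>(fieldValues)));
        return loc;
    }

    String classOf(int loc) {
        return cells.get(loc).classTag;
    }

    // Field read; an absent field corresponds to a stuck configuration.
    Object get(int loc, String f) {
        ObjectValue o = cells.get(loc);
        if (!o.fields.containsKey(f)) {
            throw new IllegalStateException("stuck: no field " + f);
        }
        return o.fields.get(f);
    }

    // Field update; the class tag of the stored object is unchanged.
    void set(int loc, String f, Object v) {
        ObjectValue o = cells.get(loc);
        if (!o.fields.containsKey(f)) {
            throw new IllegalStateException("stuck: no field " + f);
        }
        o.fields.put(f, v);
    }
}
```

Copying the field map in alloc keeps two objects created from the same initializers independent of each other.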


Note how the annotations are used in defining reductions. In
(r-super), we retrieve the method m from c', the immediate superclass
of the class c which contains the super call. The loc annotation in
(r-super) is the location of this. In the rules (r-get) and (r-set)
for field access, the class annotation is used to enforce privacy;
f must be declared in the class which contains the field access
expression.

Even  though  an  upward  cast  has  no  effect  at  runtime,  the  side 
condition  of  (r-cast)  ensures  that  evaluation  will  get  stuck  if  a  cast 
is  somehow  applied  to  an  object  of  inappropriate  type. 

Javacito is sound, meaning that a program which passes the static
type checking will not get stuck at runtime. Thanks to the strong
side conditions on (r-get) and (r-set), this also means that privacy
will not be violated at runtime. To formally state and prove
soundness we use standard techniques and extend the typing relation
to a relation $\mathcal{H}; \mathcal{R} \vdash_c (exp, S) : ty$ on
runtime configurations (see e.g. [13]), which allows us to prove the
subject reduction and progress properties. Details can be found in
the accompanying technical report [23].

3  The  Target  Language 

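The target language MiniFlint is a call-by-value variant of the
omega-order polymorphic lambda calculus ($F^\omega$) [18, 33]. We
extend $F^\omega$ with standard constructs for recursive types of
base kind (with explicit fold and unfold), existential types with dot
notation [8], tuples with row polymorphism, and a store. The syntax
is given in Figure 7, and the static semantics in Figure 8. Notice
that we only list the formation rules for environments, types, and
terms; the type reduction and equivalence rules are omitted.

Syntactic extensions:

\[
\begin{array}{llcl}
\text{Object expression} & exp & ::= & \ldots \mid \langle c, [(f \mapsto exp)^{*}] \rangle \\
\text{Location value} & val & ::= & \ldots \mid loc \\
\text{Field map} & F & ::= & \emptyset \mid F \uplus [f \mapsto val] \\
\text{Runtime store} & S & ::= & \emptyset \mid S \uplus \{ loc \mapsto \langle c, F \rangle \}
\end{array}
\]

Evaluation contexts:

\[
\begin{array}{lcl}
E & ::= & \bullet \mid \{ E; \ldots \} \mid (ty)\ E \\
  & \mid & \langle c, [f \mapsto val_1, \ldots, f \mapsto val_n, f \mapsto E, \ldots] \rangle \\
  & \mid & E.m (\ldots) \mid val.m (val_1, \ldots, val_n, E, \ldots) \\
  & \mid & c \cdot val \triangleright \texttt{super}.m (val_1, \ldots, val_n, E, \ldots) \\
  & \mid & c \triangleright E.f \mid c \triangleright E.f = exp \mid c \triangleright val.f = E
\end{array}
\]

Primitive reductions:

\[
\begin{array}{ll}
\text{(r-new)} & \mathcal{H}; \mathcal{R} \vdash (E[\texttt{new}\ c], S) \hookrightarrow (E[\langle c, F \rangle], S) \\
 & \quad \text{where } F = [f \mapsto exp \mid (f, exp) \in_{\mathcal{H}\mathcal{R}} c] \\
\text{(r-alloc)} & \mathcal{H}; \mathcal{R} \vdash (E[\langle c, [f_j \mapsto val_j]^{j} \rangle], S) \hookrightarrow (E[loc], S[loc \mapsto \langle c, [f_j \mapsto val_j]^{j} \rangle]) \\
 & \quad \text{where } loc \notin \mathrm{dom}(S) \\
\text{(r-body)} & \mathcal{H}; \mathcal{R} \vdash (E[\{ val; exp_1; \ldots; exp_n \}], S) \hookrightarrow (E[\{ exp_1; \ldots; exp_n \}], S) \\
\text{(r-return)} & \mathcal{H}; \mathcal{R} \vdash (E[\{ val \}], S) \hookrightarrow (E[val], S) \\
\text{(r-call)} & \mathcal{H}; \mathcal{R} \vdash (E[loc.m (val_1 \ldots val_n)], S) \hookrightarrow (E[exp[loc/\texttt{this}, val_1/x_1 \ldots val_n/x_n]], S) \\
 & \quad \text{where } S(loc) = \langle c', \_ \rangle \text{ and } (m, (x_1 \ldots x_n)\ exp) \in_{\mathcal{H}\mathcal{R}} c' \\
\text{(r-super)} & \mathcal{H}; \mathcal{R} \vdash (E[c \cdot loc \triangleright \texttt{super}.m (val_1 \ldots val_n)], S) \hookrightarrow (E[exp[loc/\texttt{this}, val_1/x_1 \ldots val_n/x_n]], S) \\
 & \quad \text{where } c \prec^{c}_{\mathcal{H}} c' \text{ and } (m, (x_1 \ldots x_n)\ exp) \in_{\mathcal{H}\mathcal{R}} c' \\
\text{(r-get)} & \mathcal{H}; \mathcal{R} \vdash (E[c \triangleright loc.f], S) \hookrightarrow (E[val], S) \\
 & \quad \text{where } S(loc) = \langle c', F \rangle \text{ and } F(f) = val \text{ and } (f, \_) \mathrel{\in\!\!\!\in}_{\mathcal{R}} c \text{ and } c' \leq^{c}_{\mathcal{H}} c \\
\text{(r-set)} & \mathcal{H}; \mathcal{R} \vdash (E[c \triangleright loc.f = val], S) \hookrightarrow (E[val], S[loc \mapsto \langle c', F' \rangle]) \\
 & \quad \text{where } S(loc) = \langle c', F \rangle \text{ and } c' \leq^{c}_{\mathcal{H}} c \text{ and } F' = F[f \mapsto val] \text{ and } (f, \_) \mathrel{\in\!\!\!\in}_{\mathcal{R}} c \\
\text{(r-cast)} & \mathcal{H}; \mathcal{R} \vdash (E[(ty)\ loc], S) \hookrightarrow (E[loc], S) \\
 & \quad \text{where } S(loc) = \langle c, \_ \rangle \text{ and } c \leq_{\mathcal{H}} ty
\end{array}
\]

Figure 6: Operational semantics of Javacito.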

Following Rémy [31] we introduce a kind of rows $R^{L}$, where L is
the set of labels banned from the row. $\mathrm{Abs}^{L}$ is the
empty row missing L, and $l : \tau; \tau'$ constructs a row from the
element l of type $\tau$ and the row $\tau'$. The record constructor
$\{\cdot\}$ lifts a complete row type (with no labels missing) to
kind $\Omega$.

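Consider the following example, a function which fetches field l from
a record, regardless of what other fields are present:

\[
\begin{array}{l}
\texttt{let}\ \mathit{fst} = \Lambda t :: R^{\{l\}}.\ \lambda x : \{ l : \mathrm{Int}; t \}.\ (x.l) \\
\texttt{in}\ \mathit{fst}\ [\mathrm{Abs}^{\{l\}}]\ \{ l = 5 \} + \mathit{fst}\ [m : \mathrm{Int}; \mathrm{Abs}^{\{l, m\}}]\ \{ l = 6, m = 7 \}
\end{array}
\]

We say the function fst is polymorphic in the tail of its argument.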
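A rough analogue of this tail polymorphism can be written with Java generics. The sketch is only an approximation under stated assumptions: the classes Row, Abs, and RowDemo are hypothetical, labels are replaced by position, and the nested cells used here are not the flat ordered records of MiniFlint.

```java
// A row "l : H; T" modeled as a cell holding a head field of type H
// followed by a tail row of type T.
final class Row<H, T> {
    final H head;
    final T tail;

    Row(H head, T tail) {
        this.head = head;
        this.tail = tail;
    }
}

// The empty row.
final class Abs {
    static final Abs INSTANCE = new Abs();
    private Abs() {}
}

class RowDemo {
    // Polymorphic in the tail T: accepts any row whose first field is
    // an Integer, whatever fields follow it.
    static <T> int fst(Row<Integer, T> x) {
        return x.head;
    }

    static int run() {
        // {l = 5}
        Row<Integer, Abs> r1 = new Row<Integer, Abs>(5, Abs.INSTANCE);
        // {l = 6, m = 7}
        Row<Integer, Row<Integer, Abs>> r2 =
            new Row<Integer, Row<Integer, Abs>>(6, new Row<Integer, Abs>(7, Abs.INSTANCE));
        return fst(r1) + fst(r2);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The type parameter T plays the role of the row variable t: each call to fst instantiates it with the type of whatever follows the first field.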


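\[
\begin{array}{lcl}
\kappa & ::= & \Omega \mid R^{L} \mid \kappa \to \kappa' \\
\tau & ::= & \mathrm{Ref}\ \tau \mid \tau \to \tau' \\
     & \mid & t \mid x.\mathrm{Typ} \mid \forall t :: \kappa.\ \tau \mid \exists t :: \kappa.\ \tau \mid \mu t.\ \tau \mid \lambda t :: \kappa.\ \tau \mid \tau\ \tau' \\
     & \mid & \mathrm{Abs}^{L} \mid l : \tau; \tau' \mid \{ \tau \} \\
e & ::= & \texttt{ref}\ e \mid\ !e \mid e := e' \\
  & \mid & x \mid \lambda x : \tau.\ e \mid e\ x \mid \texttt{let}\ x = e'\ \texttt{in}\ e \\
  & \mid & \texttt{fold}\ e\ \texttt{as}\ \tau \mid \texttt{unfold}\ e \\
  & \mid & \Lambda t :: \kappa.\ e \mid e[\tau] \mid \langle t :: \kappa = \tau, e : \tau' \rangle \mid x.\mathrm{val} \\
  & \mid & \{ (l = e)^{*} \} \mid e.l
\end{array}
\]

\[
t \in \mathit{TypeVars} \qquad x \in \mathit{Vars} \qquad l \in \mathit{Labels} \qquad L \in \mathcal{P}(\mathit{Labels})
\]

Derived forms:

\[
\begin{array}{lcl}
\{ l_1 : \tau_1; \ldots l_n : \tau_n \} & = & \{ l_1 : \tau_1; \ldots l_n : \tau_n; \mathrm{Abs}^{\{l_1, \ldots, l_n\}} \} \\
1 & = & \{ \mathrm{Abs}^{\emptyset} \} \\
e\ e' & = & \texttt{let}\ x = e\ \texttt{in}\ \texttt{let}\ x' = e'\ \texttt{in}\ x\ x', \text{ where } x \text{ is not free in } e'
\end{array}
\]

Figure 7: Syntax of MiniFlint.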

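Types:

\[
\frac{\vdash \Delta\ \mathrm{env} \qquad \Delta(x) = \exists t :: \kappa.\ \tau}
     {\Delta \vdash x.\mathrm{Typ} :: \kappa}
\qquad
\frac{\vdash \Delta\ \mathrm{env}}
     {\Delta \vdash \mathrm{Abs}^{L} :: R^{L}}
\]

\[
\frac{\Delta \vdash \tau :: \Omega \qquad \Delta \vdash \tau' :: R^{L \cup \{l\}}}
     {\Delta \vdash l : \tau; \tau' :: R^{L}}
\qquad
\frac{\Delta \vdash \tau :: R^{\emptyset}}
     {\Delta \vdash \{ \tau \} :: \Omega}
\]

Terms:

\[
\frac{\Delta, x : \tau' \vdash e : \tau \qquad \Delta \vdash \tau :: \Omega}
     {\Delta \vdash \lambda x : \tau'.\ e : \tau' \to \tau}
\qquad
\frac{\Delta \vdash e : \{ l_1 : \tau_1; \ldots l_n : \tau_n; \tau' \}}
     {\Delta \vdash e.l_n : \tau_n}
\]

\[
\frac{\forall j \in \{1..n\}.\ \Delta \vdash e_j : \tau_j}
     {\Delta \vdash \{ l_1 = e_1, \ldots l_n = e_n \} : \{ l_1 : \tau_1; \ldots l_n : \tau_n \}}
\]

\[
\frac{\Delta \vdash e : \tau[\mu t.\ \tau / t]}
     {\Delta \vdash \texttt{fold}\ e\ \texttt{as}\ \mu t.\ \tau : \mu t.\ \tau}
\qquad
\frac{\Delta \vdash \tau' :: \kappa \qquad \Delta \vdash e : \tau[\tau'/t]}
     {\Delta \vdash \langle t :: \kappa = \tau', e : \tau \rangle : \exists t :: \kappa.\ \tau}
\]

\[
\frac{\Delta \vdash e : \mu t.\ \tau}
     {\Delta \vdash \texttt{unfold}\ e : \tau[\mu t.\ \tau / t]}
\qquad
\frac{\vdash \Delta\ \mathrm{env} \qquad \Delta(x) = \exists t :: \kappa.\ \tau}
     {\Delta \vdash x.\mathrm{val} : \tau[x.\mathrm{Typ}/t]}
\]

Figure 8: Static semantics of MiniFlint: less common rules.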

Since  these  are  ordered  records,  the  use  of  labels  here  is  strictly  for 
clarity  of  presentation.  In  an  implementation  L  could  be  an  integer 
indicating  the  offset  of  the  row  from  the  beginning  of  the  enclosing 
record.  Note  also  that  record  terms  are  just  non-extensible  tuples. 

An existential type $\exists t :: \kappa.\ \tau$ is the type of a
pair containing a hidden type $\tau'$ of kind $\kappa$ and a term of
type $\tau[\tau'/t]$. We use dot notation [8] to access the type
(x.Typ) and term (x.val) components of a package x. Although this is
a form of dependent type, the calculus remains decidable because we
restrict the .Typ operation to term variables x. We provide a let
construct to bind term variables and limit their scope. Abstract
types x.Typ and x'.Typ are equivalent if and only if x and x' are
bound by the same let. The argument of an application must be a
variable so that we can track this equivalence through function
applications. Variables are prevented from escaping the scope of
λ-abstractions $\lambda x : \tau.\ e$ by banning free occurrences of
x in the type of e.

Proposition 2 (Decidability) The MiniFlint typing relation
$\Delta \vdash e : \tau$ is decidable.

We omit the dynamic semantics of MiniFlint, which defines
lookup on interface names rather than on method names. Second,
when  casting  an  object  from  class  type  to  interface  type,  we  can 
select  the  itable  and  then  pair  it  with  the  object  itself.  This  avoids 
name  lookup  entirely  but  requires  minor  coercions  when  casting  to 
and  between  interface  types.  The  object  marked  (c)  in  Figure  9  is 
the  result  of  casting  an  SPt  object  (b)  to  the  Zm  interface. 

In our encoding, we use coercions to achieve fixed-offset method
selection regardless of whether we are using a class or interface
object. From our intuition and experience with implementing
functional languages, we believe the cost of the coercion will be
paid for by the fast method accesses it enables. Another reason to
use the coercion in a typed implementation is that the dictionary
lookup (and caching the results for quicker access later) is
extremely difficult to prove type-safe.

5  Translation 


Figure  9:  Object  layout,  (a)  and  (b)  are  instances  of  classes  Pt  and 
SPt,  respectively,  (c)  is  a  view  of  (b)  through  the  Zm  interface. 


locations,  stores  S,  values  v,  and  a  reduction  relation  <—¥  between 
term-store  pairs.  Since  runtime  behavior  does  not  depend  on  types, 
we  can  use  a  type-erasure  semantics;  this  implies  that  fold,  unfold, 
type  application,  package  creation,  and  package  projection  are  all 
no-ops  at  runtime.  However  for  proving  type  soundness  and  cor¬ 
rectness  of  the  translation  of  Javacito  to  MiniFlint  we  define  a 
reduction  semantics  on  typed  terms.  The  reader  is  referred  to  the 
accompanying  technical  report  for  details. 

4  Object  Layout 

We  start  by  examining  the  object  layout  used  in  our  encoding.  Fig¬ 
ure  9  depicts  the  runtime  representation  of  two  Java  class  objects 
and  an  interface  object.  The  classes  and  interfaces  are  taken  from 
the  example  in  Figure  1 . 

An  object  is  a  record  containing  a  virtual  table  (viable)  and  the 
private  fields.  The  viable  is  a  per-class  data  structure  containing 
pointers  to  the  methods  defined  in  the  class  or  inherited  from  the 
superclass. 

The  object  marked  (a)  is  an  instance  of  the  Pt  class,  which  has 
one  field  and  three  methods.  The  object  marked  (b)  is  an  instance 
of  the  subclass  SPt,  which  extends  Pt  with  a  scale  field  and  a  zoom 
method.  SPt  overrides  the  move  method;  the  others  are  inherited 
from  the  superclass  by  simply  pointing  to  the  same  code.  Note  that 
both  the  object  record  and  the  viable  of  (a)  are  prefixes  of  those 
in  (b).  This  property  allows  superclass  methods  to  access  objects 
created  by  subclasses  without  any  coercions. 

In  addition  to  its  four  methods,  the  viable  for  class  SPt  contains 
a  pointer  to  an  interface  table  (itable).  An  itable  is  merely  a  rear¬ 
rangement  of  some  of  the  methods  in  the  viable  to  correspond  to  an 
interface  which  the  class  implements.  Given  an  object  of  interface 
type,  we  know  nothing  about  the  shape  of  its  viable.  Thus,  we  use 
the  itable  to  give  a  consistent  view  of  the  methods  in  a  particular 
interface,  independent  of  the  underlying  viable. 

There  are  various  ways  of  locating  methods  in  interface  objects. 
One  technique,  used  in  the  Toba  ahead-of-time  compiler  [30],  is  to 
construct  a  per-class  dictionary  that  maps  method  names  to  offsets 
in  the  viable.  Another  idea,  implemented  in  the  CACAO  64-bit 
JIT  compiler  [22]  and  described  informally  elsewhere  [26],  is  to 
construct  an  itable  for  each  interface  we  implement  and  store  them 
all  somewhere  in  the  viable.  Although  [22]  is  not  clear  on  how  to 
use  the  itable,  there  appear  to  be  two  choices.  First,  we  can  search 
for  the  appropriate  itable  in  the  viable,  which  amounts  to  dictionary 


The main result of this paper is a type-preserving translation
algorithm that compiles a well-typed Javacito program into a term of
the target language MiniFlint. In this section, we first informally
describe our algorithm, using specific examples and a set of type
and term macros (see Figures 10 and 11). We then give the formal
algorithm (see Figure 12) and show that it is both type-correct and
sound.

Each class declaration is separately translated into a let-bound
existential package. A skeleton of this package for a class c
follows:


let  Xc  = 


Missing  components  in  the  package  have  been  replaced  with  boxes 
indicating  the  subsections  below  which  contain  further  detail. 

The package is bound to x_c, a variable obtained directly from
the class name c. Formally, x_(·) is a map from ClassNames to Vars.
In examples, we distinguish the two domains using different fonts;
the package for class Pt is bound to the variable Pt.

The type component (t_own) of the package contains the type of
the private fields declared in class c. Outside the package, this type
can be selected using dot notation (x_c.Typ) but it remains abstract.
The value part of the package is a record containing an extensible
dictionary (dict), a field initializer (init), and a constructor (new).
We say dict is extensible because it is parameterized by row types
for additional fields and methods. To produce a vtable from an
extensible dictionary, we instantiate these row types to empty.
Subclasses will instantiate them according to their own new fields and
methods.

In  Section  5.1,  we  describe  the  encoding  of  private  fields  and 
field  initialization.  Section  5.2  covers  the  typing  issues  for  vtables 
and  objects.  Terms  for  method  invocation,  super,  new,  and  casts 
are  given  in  Section  5.3.  Inheritance  and  interfaces  are  handled  in 
sections  5.4  and  5.5,  respectively.  Finally,  Section  5.6  makes  some 
formal  claims  about  the  translation. 


5.1  Fields  and  privacy 

5.1.1 Own fields

Let the fields declared in class c be represented by a record of
ML-style polymorphic ref cells called OwnFlds[c]. For example,
OwnFlds[Pt] = {x : Ref Int}. Using a more efficient flat mutable
record presents no problems, but for simplicity we prefer to
preserve the orthogonality of features in MiniFlint.

A  tricky  issue  in  defining  OwnFlds  [c]  is  that  the  class  c  may 
contain  a  field  of  type  c,  whose  privates  are  also  visible  to  code 
within  class  c.  Imagine  a  class  List  which  has  fields  ‘int  data’ 
and  ‘List  next’.  Within  class  List,  it  is  legal  to  access  not  only 
‘this.next’  but  also  ‘this.next.next’  and  ‘this.next.next.next’,  and 
so  on,  ad  infinitum. 

This implies that, in the general case, OwnFlds[c] must be a
recursive type:

OwnFlds[c] = μt_own. OwnFldsGen[c] (ObjGen[c] t_own)

where ObjGen[c] (see Section 5.2.2) is the type of a class object
parameterized by the type of its private fields. (All type operators
used in the translation are listed in Figure 10.) OwnFldsGen[c]
takes an argument t_twin, creates a record containing ref cells for
each field declared in c, and makes any fields of type c have type
t_twin instead. For the class List mentioned above,

OwnFldsGen[List] = λt_twin. {data : Ref Int; next : Ref t_twin}

Substituting this into OwnFlds[List], we have:

OwnFlds[List] = μt_own. {data : Ref Int;
                         next : Ref (ObjGen[List] t_own)}

As required, the private fields of List include a next field which, in
turn, is an object whose private fields include a next field, etc.
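The one-step unrolling of OwnFlds[List] can be mimicked with a small type representation. The sketch below is illustrative only: the tuple encoding of types and the helper names `subst` and `unfold` are assumptions made for this example, not part of MiniFlint, and ObjGen[List] is kept as an opaque constructor rather than expanded.

```python
# Types as nested tuples (an encoding assumed for this sketch):
#   ("var", name), ("int",), ("ref", t), ("obj", cls, t_own),
#   ("record", ((label, type), ...)), ("mu", name, body)

def subst(ty, name, repl):
    """Replace free occurrences of type variable `name` in `ty` by `repl`.
    `repl` is assumed closed, so no capture-avoiding renaming is needed."""
    tag = ty[0]
    if tag == "var":
        return repl if ty[1] == name else ty
    if tag == "int":
        return ty
    if tag == "ref":
        return ("ref", subst(ty[1], name, repl))
    if tag == "obj":
        return ("obj", ty[1], subst(ty[2], name, repl))
    if tag == "record":
        return ("record", tuple((l, subst(t, name, repl)) for l, t in ty[1]))
    if tag == "mu":
        if ty[1] == name:          # inner binder shadows the outer variable
            return ty
        return ("mu", ty[1], subst(ty[2], name, repl))
    raise ValueError(f"unknown type form: {tag}")

def unfold(ty):
    """mu t. body  ==>  body[mu t. body / t]"""
    assert ty[0] == "mu"
    return subst(ty[2], ty[1], ty)

def own_flds_gen_list(t_twin):
    # OwnFldsGen[List] = λt_twin. {data : Ref Int; next : Ref t_twin}
    return ("record", (("data", ("ref", ("int",))),
                       ("next", ("ref", t_twin))))

# OwnFlds[List] = μt_own. OwnFldsGen[List] (ObjGen[List] t_own)
own_flds_list = ("mu", "t_own",
                 own_flds_gen_list(("obj", "List", ("var", "t_own"))))
unrolled = unfold(own_flds_list)
```

After unrolling, the `next` field refers to an object whose private-field type is again the full recursive type, which is what permits chains such as this.next.next.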

5.1.2  All  fields 

An  object  contains  not  only  the  fields  of  its  own  class,  but  the  fields 
of  all  its  superclasses  as  well.  Field  visibility,  however,  is  based  on 
the  class  in  which  the  fields  were  declared.  Fields  from  each  class 
can  be  hidden  or  revealed  independently.  This  implies  that  we  need 
to  segregate  the  fields  by  the  class  in  which  they  were  declared. 

Let AllFldsGen_s[c] produce the type of the segregated record
containing all fields in an object of class c. In the example,

AllFldsGen_s[Pt]  = λt_own. {Pt : t_own}

AllFldsGen_s[SPt] = λt_own. {Pt : Pt.Typ; SPt : t_own}

where Pt.Typ is the abstract type of private fields in Pt. t_own can be
instantiated to OwnFlds[c] for use inside the class c, or to x_c.Typ
outside, where x_c is bound to the package implementing class c.

We use row polymorphism to extend superclass field record
types in subclasses. We additionally parameterize the above
operators by t_subFlds, a row of field records defined by potential
subclasses. Now we can define the SPt fields using the Pt fields
directly:

AllFldsGen[Pt]  = λt_own. λt_subFlds. {Pt : t_own; t_subFlds}

AllFldsGen[SPt] = λt_own. λt_subFlds.
                      AllFldsGen[Pt] Pt.Typ (SPt : t_own; t_subFlds)

In  the  general  case,  the  segregated  record  type  is  produced  using 
two  mutually  recursive  operators  (see  Figure  10). 
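The recursion through the class hierarchy can be sketched as two mutually recursive functions that build the list of (label, type) pairs in the segregated record. This is only an approximation under stated assumptions: the `SUPER` table, the string representation of types, and the base case at `Object` are hypothetical choices made for the example.

```python
# Hypothetical hierarchy table for the running example.
SUPER = {"Pt": "Object", "SPt": "Pt"}

def all_flds_gen(c, t_own, sub_flds):
    """Mirrors AllFldsGen[c] t_own t_subFlds: prepend c's own entry to the
    subclass row and hand the extended row to the superclass."""
    return all_flds(SUPER[c], [(c, t_own)] + sub_flds)

def all_flds(c, sub_flds):
    """Mirrors AllFlds[c] t_subFlds: like all_flds_gen, but with c's own
    fields held abstract as c.Typ. Object is assumed to contribute nothing."""
    if c == "Object":
        return sub_flds
    return all_flds_gen(c, f"{c}.Typ", sub_flds)

# Inside the SPt package, with the empty row for potential subclasses:
spt_row = all_flds_gen("SPt", "OwnFlds[SPt]", [])
```

Here `spt_row` lists the Pt entry (abstract) before the SPt entry (concrete), matching the shape {Pt : Pt.Typ; SPt : t_own} given above.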

5.1.3  Field  initialization 


UnpackObj[c] (t_own, o : ObjGen[c] t_own, •) =
    let x = unfold o in let x' = x.val in •

PackObj[c] (t_own, τ_subFlds, τ_subVtab,
            self : SelfGen[c] t_own τ_subFlds τ_subVtab) =
    fold ⟨t_subFlds = τ_subFlds,
          ⟨t_subVtab = τ_subVtab,
           self : SelfGen[c] t_own τ_subFlds t_subVtab⟩
          : ∃t_subVtab. SelfGen[c] t_own t_subFlds t_subVtab⟩
    as ObjGen[c] t_own

UnpackView[i] (io : ExtView[i], •) = let x = unfold io in •

PackView[i] (τ_cobj, o : τ_cobj, it : ExtItab[i] τ_cobj) =
    fold ⟨t_cobj = τ_cobj,
          {cobj = o, itab = it}
          : ViewTmpl[i] (x_i.Typ) ExtView[i] t_cobj⟩
    as ExtView[i]

GetFields[c] (o : ObjGen[c] OwnFlds[c]) =
    UnpackObj[c]
        (OwnFlds[c], o, unfold (unfold (x'.val).fields.l_c))

Call[c] (t_own, o : ObjGen[c] t_own, m) =
    UnpackObj[c] (t_own, o, unfold (x'.val).vtab.l_m (x'.val))

Call[i] (_, io : ExtView[i], m) =
    UnpackView[i] (io, x.val.itab.l_m (x.val.cobj))

Cast[c, c'] (t_own, t'_own, o : ObjGen[c] t_own) =
    UnpackObj[c] (t_own, o,
        PackObj[c'] (t'_own, (l_c : t_own; x.Typ),
                     NewPublic[c] ExtObj[c] x'.Typ, x'.val))
    if c' = Super[c]

Cast[c, i] (t_own, _, o : ObjGen[c] t_own) =
    UnpackObj[c] (t_own, o,
        PackView[i] (SelfGen[c] t_own x.Typ x'.Typ, x'.val,
                     unfold (x'.val).vtab.l_i))

Cast[i, i'] (_, _, io : ExtView[i]) =
    UnpackView[i] (io,
        PackView[i'] (x.Typ, x.val.cobj, x.val.itab.l_i'))

Prolog[c] (t_subFlds, t_subVtab, (x_j : τ_j)*, •) =
    λself : SelfGen[c] OwnFlds[c] t_subFlds t_subVtab. (λx_j : τ_j.)*
    let this = PackObj[c] (OwnFlds[c], t_subFlds, t_subVtab, self)
    in •

CDecl[c] (classBody : ClassGen[c] OwnFlds[c], •) =
    let x_c = ⟨t_own = OwnFlds[c], classBody : ClassGen[c] t_own⟩
    in •

IDecl[i] (•) =
    let x_i = ⟨t_stamp = Abs^MLabels[i],
               Λt_cobj. λit : ItabGen[i] Abs^MLabels[i] t_cobj. it
               : ∀t_cobj. ItabGen[i] Abs^MLabels[i] t_cobj
                          → ItabGen[i] t_stamp t_cobj⟩
    in •


Figure 11: Term macros used in the translation.

As a part of the code for class c we define a function init which
creates a record of ref cells, each initialized with the translated field
initializer expression. Every class x_c exports init as a function of
type 1 → x_c.Typ. For the Pt class in our example, Pt.val.init is
By convention, the type variables used in the translation have the following kinds and intended meaning
(in the context of a template with parameter c or i):

t_self      type of receiver object itself
t_twin      type of an argument/result of receiver's class
t_own       type of private fields of object
t_subFlds   type of row of field records of a subclass of c
t_subVtab   type of methods/interfaces of a subclass of c, parameterized by type of self
t_cobj      type of object in an interface view
t_stamp     type of interface's "hidden methods"

Given environments ℋ and ℰ,

Super[c]    = c' s.t. c ext c' in ℋ
SCLabels[c] = {l_c' | c <_ℋ c'}
MLabels[c]  = {l_m | (m, _) ∈_ℋℰ c} ∪ {l_i | c implements i in ℋ}
MLabels[i]  = {l_m | (m, _) ∈_ℋℰ i} ∪ {l_i' | i <_ℋ i'}

NewPublic[c] = λt_twin. λt_subVtab. λt_self.
                   ((l_i : ExtItab[i] t_self)*; (l_m : t_self → [[T]]_c t_twin)*; t_subVtab t_self)
               where i ← π2(ℋ(c)), (m, T) ← π2(ℰ(c))

ObjTmpl[c]  = λt_own. λt_twin. λt_subFlds. λt_subVtab. μt_self.
                  {vtab : VtabGen[c] t_twin t_subVtab t_self;
                   fields : AllFldsGen[c] t_own t_subFlds}

ObjGen[c]   = λt_own. μt_twin. ∃t_subFlds. ∃t_subVtab. ObjTmpl[c] t_own t_twin t_subFlds t_subVtab

ExtObj[c]   = ObjGen[c] (x_c.Typ)

SelfGen[c]  = λt_own. λt_subFlds. λt_subVtab. ObjTmpl[c] t_own (ObjGen[c] t_own) t_subFlds t_subVtab

OwnFldsGen[c] = λt_twin. {(l_f : Ref ([[T]]_c t_twin))*}
                where (f, T) ← π1(ℰ(c))

OwnFlds[c]    = μt_own. OwnFldsGen[c] (ObjGen[c] t_own)

AllFldsGen[c] = λt_own. λt_subFlds. AllFlds[Super[c]] (l_c : t_own; t_subFlds)

AllFlds[c]    = λt_subFlds. AllFldsGen[c] (x_c.Typ) t_subFlds

VtabGen[c]  = λt_twin. λt_subVtab. λt_self. ExtVtab[Super[c]] (NewPublic[c] t_twin t_subVtab) t_self
ExtVtab[c]  = λt_subVtab. λt_self. VtabGen[c] ExtObj[c] t_subVtab t_self

and ExtVtab[Object] = λt_subVtab. λt_self. {t_subVtab t_self}

DictGen[c]  = λt_own. λt_subFlds. λt_subVtab.
                  VtabGen[c] (ObjGen[c] t_own) t_subVtab (SelfGen[c] t_own t_subFlds t_subVtab)

ClassGen[c] = λt_own. {dict : ∀t_subFlds. ∀t_subVtab. DictGen[c] t_own t_subFlds t_subVtab;
                       init : 1 → OwnFldsGen[c] (ObjGen[c] t_own);
                       new : 1 → ObjGen[c] t_own}

ItabTmpl[i] = λt_stamp. λt_twin. λt_cobj. {(l_i' : ExtItab[i'] t_cobj)*; (l_m : t_cobj → [[T]]_i t_twin)*; t_stamp}
              where i' ← ℋ(i), (m, T) ← π2(ℰ(i))

ViewTmpl[i] = λt_stamp. λt_twin. λt_cobj. {cobj : t_cobj; itab : ItabTmpl[i] t_stamp t_twin t_cobj}
ViewGen[i]  = λt_stamp. μt_twin. ∃t_cobj. ViewTmpl[i] t_stamp t_twin t_cobj
ItabGen[i]  = λt_stamp. λt_cobj. ItabTmpl[i] t_stamp (ViewGen[i] t_stamp) t_cobj

ExtView[i]  = ViewGen[i] (x_i.Typ)

ExtItab[i]  = λt_cobj. ItabGen[i] (x_i.Typ) t_cobj

[[ty_1 ... ty_n → ty]]_ty' = λt_twin. [[ty_1]]_ty' t_twin → ... → [[ty_n]]_ty' t_twin → [[ty]]_ty' t_twin

[[ty]]_ty' t_twin = t_twin,     if ty = ty'
                    ExtObj[c]   otherwise, if ty = c
                    ExtView[i]  otherwise, if ty = i

Figure 10: Type operators used in the translation.


(trans-var)
    x ∈ dom(Γ)
    --------------------------------------------------
    ℋ; ℰ; Γ; _ ⊢ x : Γ(x) ⇝ x

(trans-new)
    ℋ ⊢ c'
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ new c' : c' ⇝
        (x_c {}).new {},     if c' = c
        x_c'.val.new {},     if c' ≠ c

(trans-get)
    ℋ; ℰ; Γ; c ⊢ exp : ty' ⇝ e
    ty' <_ℋ c        (f, ty) ∈_ℰ c
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ c ▷ exp.f : ty ⇝ !((GetFields[c] (e)).l_f)

(trans-set)
    ℋ; ℰ; Γ; c ⊢ exp_1 : ty' ⇝ e_1        ty' <_ℋ c
    ℋ; ℰ; Γ; c ⊢ exp_2 : ty ⇝ e_2         (f, ty) ∈_ℰ c
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ c ▷ exp_1.f = exp_2 : ty ⇝
        (GetFields[c] (e_1)).l_f := e_2

(trans-class)
    ℋ; ℰ; ∅; c ⊢ exp_j : ty_j ⇝ e_j
    ℋ; ℰ; this : c, ArgTypes[c](msig_j); c ⊢ exp'_j : RsltType(msig_j) ⇝ e'_j
    --------------------------------------------------
    ℋ; ℰ; c; c' ⊢ {(ty_j f_j = exp_j;)* (msig_j exp'_j)*} ⇝
        λx_c : 1 → ClassGen[c] OwnFlds[c]. λ_: 1.
        let dict = Λt_subFlds. Λt_subVtab.
            let super = x_c'.val.dict [l_c : OwnFlds[c]; t_subFlds]
                            [NewPublic[c] (ObjGen[c] OwnFlds[c]) t_subVtab]
            in let newdict =
                {(l_mj = Prolog[c] (t_subFlds, t_subVtab, ArgTypes[c](msig_j), e'_j))*}
            in MkDict[c] (super, newdict)
        in let init = λ_: 1. {(l_fj = ref e_j)*}
        in let new = λ_: 1.
            PackObj[c] (OwnFlds[c],
                fold {vtab = dict,
                      fields = {(l_c'' = x_c''.val.init {})^(l_c'' ∈ SCLabels[c']),
                                l_c = init {}}}
                as SelfGen[c] OwnFlds[c] t_subFlds t_subVtab)
        in {dict = dict, init = init, new = new}

    where ArgTypes[c](ty m (ty_k x_k)*) = (x_k : [[ty_k]]_c)*
          RsltType(ty m (ty_k x_k)*) = ty

(trans-body)
    ∀j ∈ {1..n}. ℋ; ℰ; Γ; c ⊢ exp_j : ty_j ⇝ e_j
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ { exp_1; ... ; exp_n } : ty_n ⇝ (e_1; ...; e_n)

(trans-call)
    ℋ; ℰ; Γ; c ⊢ exp : ty' ⇝ e
    (m, (ty_1 ... ty_n → ty)) ∈_ℋℰ ty'
    ∀j ∈ {1..n}. ℋ; ℰ; Γ; c ⊢ exp_j : ty_j ⇝ e_j
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ exp.m (exp_1 ... exp_n) : ty ⇝
        Call[ty'] (Own[c](ty'), e, m) (e_j)*

(trans-cast)
    ℋ; ℰ; Γ; c ⊢ exp : ty' ⇝ e        ty' <_ℋ ty
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ (ty)exp : ty ⇝
        Cast[ty', ty] (Own[c](ty'), Own[c](ty), e)

    where Own[c](ty) = OwnFlds[c],   if ty = c
                       x_ty.Typ,     otherwise

(trans-class-decl)
    ℋ'; ℰ'; _ ⊢_D class c ext c' imp i* {(ty_j f_j = exp_j;)* (msig_j exp'_j)*} ⇒ ℋ; ℰ
    ℋ; ℰ; c; c' ⊢ {(ty_j f_j = exp_j;)* (msig_j exp'_j)*} ⇝ e
    ℋ; ℰ ⊢ p ⇒ ℋ''; ℰ''; e'
    --------------------------------------------------
    ℋ'; ℰ' ⊢ class c ext c' imp i* {(ty_j f_j = exp_j;)* (msig_j exp'_j)*} p ⇒
        ℋ''; ℰ''; CDecl[c] (Y_v e {}, e')

(trans-iface-decl)
    ℋ'; ℰ'; _ ⊢_D interface i ext i_1 ... i_n {msig_1; ... msig_k;} ⇒ ℋ; ℰ
    ℋ; ℰ ⊢ p ⇒ ℋ''; ℰ''; e
    --------------------------------------------------
    ℋ'; ℰ' ⊢ interface i ext i_1 ... i_n {msig_1; ... msig_k;} p ⇒
        ℋ''; ℰ''; IDecl[i] (e)

(trans-super)
    c' = Super[c]        (m, (ty_1 ... ty_n → ty)) ∈_ℋℰ c'
    ∀j ∈ {1..n}. ℋ; ℰ; Γ; c ⊢ exp_j : ty_j ⇝ e_j
    --------------------------------------------------
    ℋ; ℰ; Γ; c ⊢ c ▷ this ▷ super.m (exp_1 ... exp_n) : ty ⇝
        UnpackObj[c] (this, super.l_m (x'.val) (e_j)*)

Figure 12: Translation of Javacito to MiniFlint.


λ_: 1. {x = ref 0}. To create the segregated record containing the
fields of each superclass, we simply call the init functions for each
superclass. Suppose the following code is inside the SPt package:

let init = λ_: 1. {scale = ref 1}
in ... {Pt = Pt.val.init {}, SPt = init {}} ...

Then the type of the record will be AllFldsGen[SPt] applied to
OwnFlds[SPt] and the empty row Abs^L, where the label set L is
{Pt, SPt}.
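With types erased, the segregated record is just a record of records built by calling each class's init function. A minimal sketch, assuming one-element Python lists as stand-ins for ref cells and the hypothetical function names `pt_init`, `spt_init`, and `spt_fields`:

```python
def pt_init():
    # Fields declared in Pt: {x = ref 0}
    return {"x": [0]}

def spt_init():
    # Fields declared in SPt: {scale = ref 1}
    return {"scale": [1]}

def spt_fields():
    # One component per class in the hierarchy, keyed by the declaring
    # class, so each group can be hidden or revealed independently.
    return {"Pt": pt_init(), "SPt": spt_init()}

fields = spt_fields()
fields["SPt"]["scale"][0] = 3   # assignment through the ref cell
```

Each call to `spt_fields` allocates fresh cells, so two SPt objects never share field storage.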

5.2  Types  used  in  the  translation 
5.2.1  Virtual  tables 

Encoding the type of the vtable of a class is a bit trickier than
encoding the fields. We need types for the self argument and for method
arguments of the same class. The latter is not simply the same type
as external class objects because the private fields need to be
accessible.

For now, we will abstract over all of these types, and show how
to resolve them in the next section. VtabGen[c] produces the type
of a vtable for class c. Here is the parameterized type for the vtable
of Pt:

VtabGen[Pt] = λt_twin. λt_subVtab. λt_self. {
    max  : t_self → t_twin → t_twin;
    move : t_self → Int → 1;
    bump : t_self → 1;
    t_subVtab t_self}

where t_twin is the type of non-self arguments or results of class Pt.
t_subVtab is the type of a row of new methods in some subclass of
Pt, parameterized by t_self, the type of the receiver object itself.

The tail of the vtable is produced by applying the t_subVtab
operator to t_self. The reason is that new methods added by subclasses
will need to use exactly the same self type. To achieve this, we
parameterize the row type. The parameterized type of the SPt vtable
can now be defined using that of Pt (we are ignoring the Zm
interface for now):

VtabGen[SPt] = λt_twin. λt_subVtab. λt_self.
    VtabGen[Pt] ExtObj[Pt]
        (λt_self. (zoom : t_self → Int → 1;
                   t_subVtab t_self)) t_self

where ExtObj[c] is the type of an external view of an object of
class c, with all private fields hidden. It is instructive to expand
the body of this operator, making it independent of the superclass
vtable type operator:

VtabGen[SPt] = λt_twin. λt_subVtab. λt_self. {
    max  : t_self → ExtObj[Pt] → ExtObj[Pt];
    move : t_self → Int → 1;
    bump : t_self → 1;
    zoom : t_self → Int → 1;
    t_subVtab t_self}

In the vtable for class Pt, it is clear that the argument of max is
special because methods in the vtable can access its private fields.
In the vtable for class SPt, however, the type of the argument to
max becomes indistinguishable from that of any other Pt object
seen from outside, regardless of whether we inherit or override max.

As indicated by the types of functions in the vtable, we are
using curried function application to implement methods with
arguments. This is only to keep the target calculus as simple as possible.
Since we never do partial function application, switching to
multi-argument functions would present no problems.

5.2.2  Object  types 

We combine the field and vtable operators in ObjTmpl[c], a
parameterized template for generating class object types (see
Figure 10). Note that the fixed point for resolving t_self ensures that a
method selected from the vtab of an object o will accept o as its
self argument.

This template may be used to generate several variations of
class object types. In order to understand how we resolve the
parameters and which variations are needed, we must briefly examine
the extensible dictionary for a class:

dict = Λt_subFlds. Λt_subVtab.
    {max = λself : (SelfGen[Pt] OwnFlds[Pt] t_subFlds t_subVtab).
           λother : (ObjGen[Pt] OwnFlds[Pt]).
           ...}

The difference between SelfGen[c] and ObjGen[c] is that the self
type should also be extensible; this is why it takes the t_subFlds and
t_subVtab parameters. ObjGen[c], on the other hand, is the type of an
object generated by some unknown class. It might have some
additional fields and methods but we cannot know what they are. The
ObjGen[c] operator closes the object type by existentially
quantifying the two tail variables and taking a fixed point to resolve t_twin
(see Figure 10).

The only parameter of ObjGen[c] is t_own, the type of the
private fields in c. Inside the class's dictionary, this is
instantiated to OwnFlds[c] so that the private fields are accessible.
Outside the class, t_own is instantiated instead to x_c.Typ. Thus, the
type of a class object viewed from the outside is ExtObj[c] =
ObjGen[c] (x_c.Typ).

Finally, SelfGen[c] takes the template and instantiates t_twin to
ObjGen[c] t_own, but leaves the tails as parameters. This ensures
that the private fields of self will match those of other arguments of
the same class.


5.2.3  Name  equivalence 

By  using  abstract  types  in  the  fields  record  of  every  object  type,  we 
encode  the  class  hierarchy  and  preserve  Java’s  name  equivalence. 
Recall  that  two  abstract  types  x.Typ  and  x'.Typ  are  equivalent  if 
and  only  if  x  and  x'  are  bound  by  the  same  let.  Thus,  structurally 
equivalent  classes  will  translate  to  different,  incompatible  object 
types. 

At first, it might seem that a target-calculus programmer could
sabotage name equivalence. Anyone is permitted to use x_c.val.init
to create records containing values having the abstract types of
c and its superclasses. Indeed, a programmer can create a fields
record that looks like it belongs to class c, package it with his own
devious methods, and have it masquerade as an object of class c.
But this is not as subversive as it seems; it is merely subclassing!
A Javacito programmer could have achieved the same feat.

5.3  Implementing  the  dictionary 

So  far,  we  have  concentrated  mostly  on  typing  issues.  In  this  and 
the  next  section,  we  explore  the  term-level  translation  of  method 
bodies  and  expressions.  We  start  with  object  creation,  then  move 
on  to  method  bodies,  method  invocation,  and  casts. 

5.3.1  Object  creation 

For the class Pt the extensible dictionary has this shape:

dict = Λt_subFlds. Λt_subVtab.
    {max = λself. ..., move = λself. ..., bump = λself. ...}

The vtab component of an object is a dictionary specialized to the
object's type; we obtain it from dict by instantiating to empty rows. To get
an object of class Pt we then pair vtab with the fields record (see
Section 5.1.3), and fold to fix the type of self:

let p0 = fold {vtab = dict [Abs^Lc] [Abs^Lm],
               fields = {Pt = init {}}}
         as SelfGen[Pt] OwnFlds[Pt] Abs^Lc Abs^Lm

where Lc = {Pt} and Lm = {max, move, bump}.

This object can be passed as the self parameter to methods in
the vtab, but its dynamic type is exposed: anyone can see that
fields has only one component, Pt. In order to make p0's type
indistinguishable from the type of objects created by subclasses of
Pt, we need to hide the tails:

let p1 = fold ⟨t_subFlds = Abs^Lc,
               ⟨t_subVtab = Abs^Lm,
                p0 : SelfGen[Pt] t_own Abs^Lc t_subVtab⟩
               : ∃t_subVtab. SelfGen[Pt] t_own t_subFlds t_subVtab⟩
         as ObjGen[Pt] t_own

where, as usual, t_own is OwnFlds[Pt] if this expression is inside
the Pt class package, or Pt.Typ otherwise. The final fold resolves
the twin type, as in the non-self argument of max. This argument
has the same private fields accessible as does self, but its additional
fields and methods are hidden since it might have a different
dynamic type.

This  operation,  in  its  general  form,  is  encapsulated  as  the  term 
macro  PackObj  [c]  in  Figure  11.  It  is  used  not  only  for  object  cre¬ 
ation  but  for  various  forms  of  casting.  Its  inverse  is  UnpackObj  [c] . 

To implement the new function in the class package, we need
to wrap a λ around the two let-bindings above:

let new = λ_: 1. let p0 = ... in let p1 = ... in p1
Now we can simply select and apply this function whenever we
encounter ‘new Pt’, unless it appears in the body of a method in
class Pt itself. In this case, there is a circularity. The new function
refers to dict, and a component of dict refers to new. (Similarly,
‘new Pt’ might appear in a field initializer expression, in which case
there is a circularity between init and new.)

We resolve this by passing a frozen class record to the call-by-value
fixed point combinator Y_v:

let x_c = ⟨t_own = OwnFlds[c],
           Y_v [ClassGen[c] OwnFlds[c]]
               (λx_c : 1 → ClassGen[c] OwnFlds[c].
                λ_: 1. {dict = ..., init = ..., new = ...})
               {}⟩
in ...

where ClassGen[c] is the type of the class record for class c (see
Figure 10).

Using this technique, a method inside dict can create a new
object of class c with the expression (x_c {}).new {}. Note that the
fixed point is inside the package, so that the private fields of the
new object are exposed. If we applied the fixed point to the whole
package, then ‘(new Pt).x’ would be illegal.
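The frozen-record trick can be illustrated in an untyped setting. The sketch below is an assumption-laden analogue, not the paper's Y_v: `fix` is a hypothetical call-by-value fixed point over thunks, `pt_class` plays the role of the λx_c. λ_:1. {...} term, and the method names `fresh` and `getx` are invented for the example.

```python
def fix(gen):
    """Call-by-value fixed point for thunks: returns f with f() == gen(f)()."""
    def frozen():
        return gen(frozen)()
    return frozen

def pt_class(xc):
    # xc is the frozen class record; calling xc() thaws it.
    def thaw():
        def init():
            return {"x": [0]}

        def new():
            return {"vtab": dict_, "fields": {"Pt": init()}}

        dict_ = {
            # A method body that needs 'new Pt': it goes through the frozen
            # record, as in (x_c {}).new {}, instead of naming new directly.
            "fresh": lambda self: xc()["new"](),
            "getx": lambda self: self["fields"]["Pt"]["x"][0],
        }
        return {"dict": dict_, "init": init, "new": new}
    return thaw

Pt = fix(pt_class)
p = Pt()["new"]()            # 'new Pt' from outside the class
q = p["vtab"]["fresh"](p)    # 'new Pt' from inside a method of Pt
```

Because the record is only thawed when a method actually runs, building `dict_` never forces the record that is still under construction.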

5.3.2  Method  bodies 

As demonstrated in Section 5.2.2, methods are translated into
curried functions with an explicit self parameter. self is different from
other objects (including twin parameters such as other) because it
has already been unpacked; it is just a recursive record. To access
a field of other, one needs to unfold, unpack, unpack again, unfold
again, and finally, select the fields record. To access a field of self,
only the last two steps are necessary.

For convenience, we translate object operations (e.g., method
invocation, casts, field selection) uniformly, regardless of whether the
object is self. To support this, we begin each method by packing
self to make it the same shape as any other object:

let this = PackObj[c] (OwnFlds[c], t_subFlds, t_subVtab, self) in ...

Now selecting a field from this is the same as selecting from
other. An optimization phase can easily remove any consecutive
pack/unpack or fold/unfold expressions.

5.3.3  Method  invocation  (invokevirtual) 

To invoke method m on an object o, we just unfold, unpack, unpack
again, unfold again (all type manipulations), then select and apply:

let x = unfold o in let x' = x.val
in unfold (x'.val).vtab.l_m (x'.val)

The result can then be applied to any additional arguments.

Since method invocation is not atomic, we must prevent users
from selecting a method from one object and applying it to some
other object. This error is prevented because the tails of the fields
and vtab records of o are abstract types. In the code above,
x'.val.vtab.l_m will have a type which includes the abstract types
x.Typ for the additional fields in o and x'.Typ for the additional
methods in o. Since the only object which can have these types is
x'.val itself, the selected method cannot be applied to any other term.
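Under the type-erasure semantics the unfold and projection steps disappear, leaving only select-and-apply. The following sketch shows that residue in an untyped setting. The `call` helper and the method bodies for max, move, and bump are assumptions made for illustration; the source does not give the bodies of the Pt methods in this section.

```python
def call(o, m, *args):
    """Select method m from o's own vtab, apply it to o, then to the
    remaining arguments one at a time (methods are curried)."""
    f = o["vtab"][m](o)
    for a in args:
        f = f(a)
    return f

def get_x(o):
    return o["fields"]["Pt"]["x"][0]

# Hypothetical method bodies for the running Pt example.
def pt_move(self):
    def go(dx):
        self["fields"]["Pt"]["x"][0] += dx
    return go

def pt_bump(self):
    call(self, "move", 1)

def pt_max(self):
    return lambda other: self if get_x(self) >= get_x(other) else other

PT_VTAB = {"max": pt_max, "move": pt_move, "bump": pt_bump}

def new_pt():
    return {"vtab": PT_VTAB, "fields": {"Pt": {"x": [0]}}}

a, b = new_pt(), new_pt()
call(a, "move", 5)
call(b, "bump")
bigger = call(a, "max", b)
```

Nothing in this untyped version stops a caller from writing `a["vtab"]["move"](b)`; in MiniFlint that mismatch is ruled out statically by the abstract tail types.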

5.3.4  Upward  cast 

The purpose of an upward cast is to allow an object of class c to
masquerade as an object of its superclass c'. Since we already
keep objects inside nested packages to hide the tails of the fields
and vtab records, we can simply repackage an object, hiding the
components of fields and vtab that are part of c but not c'.

In the following example, let o have type ExtObj[SPt]:

let x = unfold o
in let x' = x.val
in PackObj[Pt] (Pt.Typ,
                (SPt : SPt.Typ; x.Typ),
                (λt_self. (zoom : t_self → Int → 1;
                           x'.Typ t_self)),
                x'.val)

The result is a value with type ExtObj[Pt]. We achieved this
using type manipulations only; the underlying object record was
not touched. Dynamically, the entire process is a no-op. See the
Cast[c, c'] macro in Figure 11 for the generic form.

5.4  Inheritance  and  super 

Since each class package exports an extensible dictionary,
implementing inheritance is straightforward. A class c can select the
extensible dictionary of its superclass, and instantiate the tails to its
own new fields and methods. For example, inside the SPt dict, we
bind super as follows:

dict = Λt_subFlds. Λt_subVtab.
    let super = Pt.val.dict [SPt : OwnFlds[SPt]; t_subFlds]
                            [λt_self. (zoom : t_self → Int → 1;
                                       t_subVtab t_self)]
    in {max = super.max,
        move = λself. ... super.move self ...,
        bump = super.bump,
        zoom = λself. ...}

Inheritance is implemented simply by selecting methods out of the
super dictionary (as in max and bump, for example) and placing
them in the dict of the new subclass. Source language invocations
through super are implemented by selecting a method from the
super dictionary (as in move). After the one-time type application
to produce super, no coercions are necessary.
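An untyped analogue of this dictionary construction is sketched below. The type applications that instantiate the tails have no runtime content, so only the selection from super remains. The method bodies (move scaling its argument, bump calling move) are hypothetical, and max is omitted to keep the example short.

```python
def pt_dict():
    def move(self):
        def go(dx):
            self["fields"]["Pt"]["x"][0] += dx
        return go

    def bump(self):
        # Dispatch through self's own vtab, so an override is honoured.
        self["vtab"]["move"](self)(1)

    return {"move": move, "bump": bump}

def spt_dict():
    sup = pt_dict()          # one-time selection of the super dictionary

    def move(self):
        def go(dx):
            scale = self["fields"]["SPt"]["scale"][0]
            sup["move"](self)(dx * scale)   # invocation through super
        return go

    def zoom(self):
        def go(s):
            self["fields"]["SPt"]["scale"][0] = s
        return go

    # bump is inherited by pointing at the superclass code.
    return {"move": move, "bump": sup["bump"], "zoom": zoom}

o = {"vtab": spt_dict(),
     "fields": {"Pt": {"x": [0]}, "SPt": {"scale": [1]}}}
o["vtab"]["zoom"](o)(3)
o["vtab"]["bump"](o)
```

The inherited bump runs unchanged superclass code on the subclass object, yet reaches the overriding move because it selects from the object's own vtab.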

5.5  Interfaces 

Recall  our  approach  to  implementing  interfaces  (see  Section  4).  An 
itable  is  an  arrangement  of  some  of  the  methods  in  a  vtable,  made 
to  correspond  to  an  interface  which  the  class  implements.  Given 
an  object  of  interface  type,  we  know  nothing  about  the  shape  of 
its  vtable.  Thus,  we  use  the  itable  to  give  a  consistent  view  of  the 
methods  in  the  interface,  independent  of  the  underlying  vtable. 

5.5.1  Interface  types 

The  itable  is  a  record  of  functions,  just  like  the  vtable.  For  the  Zm 
interface  in  our  example, 

    ItabGen_s[Zm] = λt_twin. λt_cobj. {zoom : t_cobj → Int → 1}

where t_cobj is the type of the underlying class object (to simplify the
presentation, we have omitted the t_stamp parameter from ItabGen;
see section 5.5.4 and figure 10). As before, t_twin is the type for
non-self  arguments  of  interface  Zm  (there  are  no  such  arguments  in 
this  particular  example). 

An  interface  object  (or  view)  is  a  pair  containing  some  class 
object  (cobj)  and  the  appropriate  itable  (itab): 

    ViewTmpl_s[Zm] = λt_twin. λt_cobj.
        {cobj : t_cobj; itab : ItabGen_s[Zm] t_twin t_cobj}

We resolve the t_cobj parameter using an existential type. We should
be able to select a method from the itable and pass cobj as its self
parameter. We do not know (or it should not matter) what t_cobj
actually is:

    View_s[Zm] = μt_twin. ∃t_cobj. ViewTmpl_s[Zm] t_twin t_cobj

Finally, we resolve t_twin using a recursive type. If the interface Zm
had any methods with arguments of type Zm, they would need to
look the same, though they might have a different hidden t_cobj type.

5.5.2  Invokeinterface 

To invoke method m on an interface object io, we bind x to
unfold io, apply x.val.itab.m to x.val.cobj, and apply the result
to  any  additional  arguments.  Note  that  this  is  essentially  the  same 
select-and-apply  operation  that  is  used  for  ordinary  method  invoca¬ 
tion  (section  5.3.3). 

Here again, the abstract type prevents users from doing anything
devious with either itab or cobj. The function x.val.itab.m
wants  an  argument  of  type  x.Typ.  The  only  possible  expression  of 
that  type  is  x.val.cobj. 
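A minimal Python sketch of this select-and-apply step is given below. The `View` class and `invoke_interface` helper are illustrative names, the two `zoom` implementations and their records are invented, and the hiding of the object type, which the calculus enforces statically, is only a convention here.

```python
from typing import Any, Callable, Dict, NamedTuple

class View(NamedTuple):
    cobj: Any                            # underlying class object; its shape is opaque to clients
    itab: Dict[str, Callable[..., Any]]  # interface methods, each expecting cobj as self

def invoke_interface(io: View, m: str, *args: Any) -> Any:
    # Select m from the itable, apply it to the hidden object,
    # then to the remaining arguments.
    return io.itab[m](io.cobj, *args)

# Two unrelated hypothetical classes whose records have different shapes.
def spt_zoom(self, s):
    self["scale"] = s

def cam_zoom(self, s):
    self[0] = self[0] * s

spt_view = View({"x": 1, "scale": 1}, {"zoom": spt_zoom})
cam_view = View([10], {"zoom": cam_zoom})
for v in (spt_view, cam_view):
    invoke_interface(v, "zoom", 4)       # same call site, different underlying layouts
```

The caller never inspects `cobj`; it only hands it back to a method drawn from the matching itable.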

5.5.3  Interface  casts 

We  have  not  yet  mentioned  how  to  create  an  interface  object  from  a 
class  object.  As  discussed  in  section  4  (and  illustrated  in  Figure  9), 
the  vtable  can  actually  contain  the  itable  for  all  the  interfaces  a  class 
implements.  Then,  casting  to  an  interface  type  is  just  a  matter  of 
selecting  the  appropriate  itable  and  pairing  it  with  the  object  itself. 
For example, the SPt dict would contain the itable for Zm:

    dict = Λt_subFlds. Λt_subVtab.
      let super = Pt.val.dict [...][...]
      in let newdict =
           {move = λself. ... super.move self ...,
            zoom = λself. ... }
      in {max = super.max,
          move = newdict.move,
          bump = super.bump,
       => Zm = {zoom = newdict.zoom},
          zoom = newdict.zoom}

Once  this  itable  is  present,  the  following  code  can  be  used  to  coerce 
an  SPt  object  o  to  a  Zm  interface  object: 

    let x = unfold o
    in let x' = x.val
    in fold ⟨t_cobj :: Ω = SelfGen[SPt] SPt.Typ x.Typ x'.Typ,
             {cobj = x'.val,
              itab = (unfold (x'.val)).vtab.Zm}
             : ViewTmpl_s[Zm] View_s[Zm] t_cobj⟩
       as View_s[Zm]

For  this  example,  we  have  assumed  that  this  code  is  contained  in 
some  class  other  than  SPt.  If  we  were  inside  SPt,  then  SPt.Typ 
above would be replaced with OwnFlds[SPt]. See the Cast[c, i]
macro in Figure 11 for the generic form.

In  section  5.3.4,  we  noted  that  upward  casts  consisted  of  type 
manipulations  only.  In  contrast,  casts  to  and  between  interfaces  re¬ 
quire  a  simple  coercion  at  runtime.  This  is  not  surprising;  types  can 
have  multiple  superinterfaces.  A  single  fixed-offset  vtable  record 
cannot,  in  general,  meet  the  needs  of  all  possible  superinterfaces. 
Either  simple  coercions  or  some  form  of  dynamic  name  lookup  are 
required. 
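The runtime cost of an interface cast can be made concrete with a short Python sketch. It assumes the layout described above, with the Zm itable stored inside the vtable; the method bodies and the `scale` field are invented, and dictionaries again stand in for ordered records.

```python
def make_spt_vtab():
    def move(self, dx):
        self["x"] += dx * self["scale"]
    def zoom(self, s):
        self["scale"] = s
    newdict = {"move": move, "zoom": zoom}
    return {"move": newdict["move"],
            "Zm": {"zoom": newdict["zoom"]},   # itable for Zm, kept inside the vtable
            "zoom": newdict["zoom"]}

def cast_to_zm(obj):
    # The coercion builds one pair: the object itself and its Zm itable.
    return {"cobj": obj, "itab": obj["vtab"]["Zm"]}

o = {"vtab": make_spt_vtab(), "x": 0, "scale": 1}
zm = cast_to_zm(o)
zm["itab"]["zoom"](zm["cobj"], 7)              # invoke zoom through the interface view
```

The cast selects one field and allocates one pair, and the itable entry is the same function that sits in the class vtable.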


5.5.4  Name  equivalence  for  interfaces 

Like  class  types,  interface  types  in  Java  use  name  equivalence.  In 
the  formal  treatment,  we  translate  interface  declarations  into  a  let- 
bound  package  for  the  purpose  of  distinguishing  different  struc¬ 
turally  equivalent  interfaces.  Each  interface  package  has  a  hidden 
stamp type (x_i.Typ); the type of an itable for that interface will
contain x_i.Typ as a component. The body of the package (x_i.val)
is  the  identity  function,  but  the  argument  type  accepts  structurally 
equivalent  dictionaries  and  the  return  type  contains  the  stamp  type. 

This  technique  serves  not  only  to  distinguish  between  differ¬ 
ent,  structurally  equivalent  interface  types,  but  it  also,  in  a  sense, 
prevents  forgery.  We  cannot  construct  an  object  that  will  satisfy 
interface i unless the interface package x_i is itself accessible.
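The stamping idea can be imitated at run time in Python, with the caveat that the paper obtains the guarantee statically through an abstract type, whereas this sketch checks a hidden token dynamically. The names `make_interface`, `seal`, and the second interface `Lens` are hypothetical.

```python
def make_interface(method_names):
    """Return (seal, is_itable) for a fresh interface; the stamp stays inside this closure."""
    stamp = object()                       # hidden stamp, unique per declaration

    def seal(methods):
        # Accept any dictionary with the right structure and return it paired
        # with the stamp; the dictionary itself is passed through unchanged.
        if set(methods) != set(method_names):
            raise ValueError("dictionary does not match the interface structure")
        return (stamp, methods)

    def is_itable(itab):
        return itab[0] is stamp

    return seal, is_itable

seal_zm, is_zm = make_interface(["zoom"])
seal_lens, is_lens = make_interface(["zoom"])   # structurally identical to Zm

methods = {"zoom": lambda self, s: None}
zm_itab = seal_zm(methods)
```

An itable for Zm can only be produced by code that holds `seal_zm`, which is the runtime analogue of needing access to the interface package.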

5.6  Formalization 

The  formal  algorithm  translating  a  (well-typed)  Javacito  program 
p  to  a  MiniFlint  term  is  presented  in  Figure  12,  using  meta¬ 
language  macros  defined  in  Figure  11.  Some  of  the  macros  are 
parameterized  contexts;  free  variables  of  a  term  placed  in  the  hole 
may  be  captured  in  the  context.  The  sequents  representing  the 
translation  are  based  on  those  defining  the  static  semantics  of  Javac¬ 
ito  (Figure  5).  In  the  translation  of  a  class  declaration  we  have 
omitted  the  straightforward  but  tedious  definition  of  the  macro 
MkDict[c], constructing a class dictionary out of a superclass
dictionary super, specialized to its subclass c, and a record newdict of
the  newly  defined  methods  of  c.  The  function  of  MkDict  [c]  is  to 
collect  selected  components  of  super  (inherited)  or  newdict  (over¬ 
ridden  or  new)  into  interface  dictionaries  and  a  class  dictionary. 

The  translation  is  total  on  type-correct  Javacito  programs,  and 
it  maps  them  to  type-correct  MiniFlint  terms. 

Proposition 3 (Type preservation) If ∅; ∅; ∅ ⊢_p p : ty, then
∅; ∅ ⊢ p ⇝ ℋ; e and ∅ ⊢ e : |ty|_c {}, where c ∉ dom(ℋ).

The  semantic  correctness  of  the  translation  is  shown  by  establish¬ 
ing  a  mapping  from  Javacito  runtime  configurations  to  MiniFlint 
runtime  configurations  which  maps  value-configurations  to  value- 
configurations,  and  proving  that  a  Javacito  reduction  step  corre¬ 
sponds  with  respect  to  this  mapping  to  finitely  many  MiniFlint 
reduction  steps  [23];  we  omit  the  proofs  due  to  space  constraints. 

6  Extensions 

Many  features  of  Java  can  be  accommodated  directly  in  our  frame¬ 
work,  but  were  left  out  in  order  to  simplify  the  formal  presentation. 

6.1  Unordered  records 

We  have  used  ordered  records  in  our  translation  of  Javacito,  but  we 
can  add  unordered  records  [38]  into  the  target  language  as  well: 

    types τ ::= ... | {|τ|}
    terms e ::= ... | {|(l = e)*|}

Here, {|τ|} denotes the unordered record type where τ must be of
kind R^∅, and {|(l = e)*|} is the unordered record term which (unlike
ordered records) requires runtime dictionary construction. Selecting
from an unordered record uses the same syntax as selecting from an
ordered record, but it requires runtime dictionary lookup.

Some  Java  compilers  use  the  same  representation  for  objects 
of  interface  types  as  for  those  of  class  types,  so  casting  to  interface 
type requires no runtime operations. Unordered records can be used
to  support  this  representation  by  collecting  all  the  itable  entries  of  a 
vtable  into  a  separate  unordered  record,  itself  an  element  of  the  still 
ordered  vtable.  Casting  an  object  into  interface  type  only  requires 
repackaging  it  (a  runtime  no-op)  to  hide  those  entries  not  exported 
by  the  current  interface. 
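The cost difference between the two record forms can be sketched in Python as follows. The offset constant, the string placeholders for method bodies, and the `spin` label are all assumptions made for the example; a tuple models an ordered record and a `dict` models an unordered one.

```python
# Ordered record: field positions are fixed when the code is generated.
ZOOM_OFFSET = 3                                   # hypothetical offset chosen by a compiler
ordered_vtab = ("max_impl", "move_impl", "bump_impl", "zoom_impl")

def select_ordered(record, offset):
    return record[offset]                         # plain indexed load

# Unordered record: a label-to-value dictionary built and searched at run time.
def make_unordered(**fields):
    return dict(fields)                           # runtime dictionary construction

def select_unordered(record, label):
    return record[label]                          # runtime dictionary lookup

a = make_unordered(zoom="zoom_impl", spin="spin_impl")
b = make_unordered(spin="spin_impl", zoom="zoom_impl")   # same fields, different order
```

Selection from `a` and `b` gives the same result regardless of field order, which is what lets one representation serve every interface a class implements.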

6.2  Access  control 

Javacito  only  allows  private  fields  and  public  methods,  but  other  ac¬ 
cess  scoping  schemes  can  also  be  supported.  Private  methods  can 
simply be let-bound within the class dictionary since they can neither
be called from subclasses nor overridden. Public fields could
be  placed  directly  in  an  object’s  fields  record,  without  needing  to 
be  segregated. 
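As a rough illustration of let-binding a private method, the Python sketch below keeps a helper local to the dictionary constructor. The helper `clamp`, the `limit` field, and the body of `move` are invented for the example.

```python
def make_dict():
    # Private method: bound locally, so it never appears in the exported dictionary.
    def clamp(self, v):
        return max(0, min(v, self["limit"]))

    def move(self, dx):
        self["x"] = clamp(self, self["x"] + dx)   # direct call, no vtable lookup

    return {"move": move}

vtab = make_dict()
p = {"vtab": vtab, "x": 8, "limit": 10}
p["vtab"]["move"](p, 5)
```

Since `clamp` is not a dictionary entry, a subclass that instantiates this dictionary can neither call nor override it.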

Protected  and  package  scopes  require  adding  a  notion  of  Java- 
style  package  into  MiniFlint.  Otherwise,  they  can  be  supported 
using the technique proposed by Moby [16]. The dict and new
fields  in  our  class  encoding  roughly  correspond  to  the  class  view 
and  the  object  view  in  Moby.  If  we  export  a  class  outside  its  defi¬ 
nitional  package,  all  protected  methods  and  fields  should  be  hidden 
from  the  object  view  but  not  the  class  view  while  those  of  package 
scope  should  be  hidden  from  both. 

6.3  Dynamic  casts  and  exception 

Dynamic  casts  and  runtime  type  identification  can  be  addressed  us¬ 
ing  extensible  variant  types  similar  to  the  exn  data  type  in  Stan¬ 
dard  ML.  We  add  a  new  extensible  variant  declaration  inside  each 
class  c  or  interface  i  (for  simplicity  we  borrow  the  syntax  of  excep¬ 
tion  declaration  in  ML): 

    exception Tag_c of ObjGen[c] t_own;
    or exception Tag_i of ViewGen[i] t_stamp;

Suppose  our  extensible  variant  type  is  named  Tagged;  then,  inside 
each  class  declaration  of  c,  we  add  a  method  named  allviews  that 
takes  the  self  object  and  returns  a  list  of  Tagged  values.  More 
specifically,  the  allviews  method  takes  the  self  object  s,  and  for 
each ancestor class c' (and interface i') it upward-casts s into an
object of c' (or i'), packages it with the Tag_c' (or Tag_i') flag, and
then  returns  the  entire  list  of  resulting  Tagged  values.  We  also  add 
the  specification  for  the  allviews  method  into  each  interface  so  that 
it  can  be  invoked  on  objects  of  interface  type  as  well. 

To  test  if  an  object  o  is  an  instanceof  class  c  (or  interface  i),  we 
invoke  the  allviews  method  on  o  and  then  check  through  the  result 
Tagged list to see if any of them is tagged with Tag_c (or Tag_i).
The checkcast operation can be implemented in the same way since
the object associated with Tag_c (or Tag_i) is already of type c (or
i). Notice that, unlike Glew’s recent work [19], our technique does not
use hierarchical extensible sums or variance-based subtyping, so it
requires  a  much  simpler  target  language.  With  careful  coding,  the 
allviews  method  can  be  implemented  as  efficiently  as  the  untyped 
code  used  in  typical  Java  compilers. 
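The allviews scheme can be sketched in untyped Python as below. Fresh `object()` values play the role of the extensible variant tags, the classes Pt and SPt and the interface Zm follow the running example, and the helper names (`Tagged`, `instance_of`, `checkcast`, `ClassCastError`) are illustrative rather than taken from the paper.

```python
class Tagged:
    """One alternative of the extensible variant: a tag plus the object viewed at that type."""
    def __init__(self, tag, view):
        self.tag, self.view = tag, view

# Each class or interface declaration contributes one fresh tag.
TAG_PT, TAG_SPT, TAG_ZM = object(), object(), object()

def spt_allviews(self):
    # Cast self to its own class, every ancestor class, and every interface; tag each result.
    return [Tagged(TAG_SPT, self),
            Tagged(TAG_PT, self),                                        # class cast: same record
            Tagged(TAG_ZM, {"cobj": self, "itab": self["vtab"]["Zm"]})]  # interface cast: new pair

def pt_allviews(self):
    return [Tagged(TAG_PT, self)]

def instance_of(obj, tag):
    return any(t.tag is tag for t in obj["vtab"]["allviews"](obj))

class ClassCastError(Exception):
    pass

def checkcast(obj, tag):
    for t in obj["vtab"]["allviews"](obj):
        if t.tag is tag:
            return t.view            # the payload already has the target type
    raise ClassCastError("object has no view with the requested tag")

spt = {"vtab": {"allviews": spt_allviews, "Zm": {"zoom": lambda self, s: None}}, "x": 0}
pt = {"vtab": {"allviews": pt_allviews}, "x": 0}
```

This version builds the whole list on every test; the efficiency remark in the text presumably refers to a more careful encoding than this one.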

6.4  Miscellaneous 

Null  references  are  encoded  by  lifting  all  external  object  types  to 
variant  types  with  a  null  alternative,  similar  to  the  option  data 
type  in  Standard  ML.  Then,  all  object  operations  must  first  verify 
that  the  object  pointer  is  not  null.  Since  this  could  fail,  we  also 
need  support  for  unchecked  exceptions,  which  can  be  achieved  by 
adding  an  exception  mechanism  in  the  style  of  Standard  ML. 
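A small Python sketch of the null encoding follows, using `Optional` as a stand-in for the lifted variant type. The exception name `NullPointerError` and the helpers `deref` and `get_x` are hypothetical.

```python
from typing import Optional

class NullPointerError(Exception):
    """Unchecked exception raised when an operation meets the null alternative."""

def deref(ref: Optional[dict]) -> dict:
    # Every object operation first checks which alternative it received.
    if ref is None:
        raise NullPointerError("null reference")
    return ref

def get_x(ref: Optional[dict]) -> int:
    return deref(ref)["x"]

some_pt: Optional[dict] = {"x": 5}
null_pt: Optional[dict] = None
```

Field access on `some_pt` succeeds after the check, while the same access on `null_pt` raises the exception instead of touching memory.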

Mutually recursive class and interface declarations need a
fixed-point construction over the corresponding components in
MiniFlint; the fixed point already used in the definition of a single
class cannot be generalized for this purpose, since this would give
the other classes “friend” status.

Other  aspects  of  Java  such  as  concurrency,  final,  reflection, 
class  loaders,  and  dynamic  loading  are  more  challenging;  we  have 
preliminary ideas about implementing them, but developing the details
remains as future work. Of course, our Fω-based framework is
particularly  well-suited  for  implementing  Java  extensions  based  on 
higher-order  functions  and  types  such  as  those  in  Pizza  [27]. 

7  Related  Work 

Object  and  class  encodings  have  been  extensively  studied.  The  use 
of  row  polymorphism  positions  our  scheme  most  closely  to  that  of 
Rémy and Vouillon [32]; however, the special object types of Objective
ML reduce the use of row polymorphism to only the cases
of  binary  methods  [4],  while  the  self-application  semantics  of  our 
scheme  uses  rows  to  represent  the  open  type  of  self  even  though 
Java  lacks  binary  methods.  The  unordered  records  of  Objective 
ML  are  geared  towards  support  for  multiple  inheritance;  in  contrast 
our  scheme  makes  single  inheritance  efficient,  while  an  extension 
with  multiple  inheritance  is  still  possible  (but  less  efficient)  using 
interface  views  as  the  basic  representation. 

In the context of object encodings in Fω-based languages [5,
29],  the  genealogy  of  our  encoding  can  be  traced  back  to  encod¬ 
ings  based  on  F-bounded  polymorphism  [7,  13],  from  which  the  F- 
bounds  have  been  eliminated  using  intersection  types,  which  have 
further  been  replaced  by  row  polymorphism  (in  the  case  of  objects) 
or  realized  as  tuples  (in  the  case  of  interface  views).  At  the  most 
basic  level,  method  invocation  uses  self-application  (the  whole  ob¬ 
ject  is  a  parameter  of  each  method),  showing  similarity  with  the 
encoding  due  to  Abadi,  Cardelli,  and  Viswanathan  [1];  however, 
hiding  the  actual  class  of  the  receiver  is  achieved  using  existential 
quantification  over  row  variables  instead  of  splitting  the  object  into 
a  known  interface  and  a  hidden  implementation.  This  allows  reuse 
of  methods  in  subclasses  without  any  overhead  (modulo  type  appli¬ 
cations,  which  are  no-ops  in  the  intended  type-erasure  semantics). 
Further, we use an analog of the recursive-existential encoding due
to  Bruce  [6]  to  give  types  to  other  arguments  or  results  belonging 
to  the  same  class  or  a  subclass,  as  needed  in  Java,  without  over¬ 
restricting  the  type  to  be  the  same  as  the  receiver’s.  The  private 
instance  variables  of  these  objects  are  thus  accessible  by  methods 
of  the  class;  they  are  protected  by  the  final  level  of  encapsulation. 

Fisher  and  Mitchell  [14,  15]  show  how  to  use  extensible  ob¬ 
jects  to  model  Java-like  class  constructs.  Our  encoding  does  not 
rely  on  extensible  objects  as  primitives,  but  it  may  be  viewed  as 
an  implementation  of  some  of  their  properties  in  terms  of  simpler 
constructs.  In  particular,  extensibility  of  objects  is  only  used  when 
they  have  the  role  of  prototypes,  and  the  ability  to  extend  an  ob¬ 
ject  in  general  takes  us  further  from  the  intended  Java  semantics. 
In  our  translation  these  roles  are  clearly  separated.  Our  encoding 
of  classes  guarantees  restricted  visibility  and  (as  a  consequence) 
correct  initialization  of  private  instance  variables,  and  they  provide 
directly  reusable  collections  of  methods  as  well  as  means  to  create 
new  objects.  One  criterion  our  encoding  fails  is  to  automatically 
propagate  changes  in  a  base  class  to  its  descendants,  but  it  is  un¬ 
clear  if  this  criterion  can  coexist  with  Java’s  binary  compatibility. 

8  Conclusions 

We  have  presented  a  formal  translation  of  Java  classes,  interfaces, 
and privacy into a call-by-value variant of Fω using simple and
well-known  extensions.  Even  though  the  resulting  code  contains 
full  type  information,  the  runtime  object  layout  corresponds  to  what 
one might expect in an untyped implementation. The operations of
object creation, method invocation, and field selection are implemented
efficiently in terms of primitive Fω constructs. Thus, they
are  candidates  for  standard  optimizations  and  we  can  reason  about 
their  interaction  with  foreign  code. 

An  implementation  of  this  encoding  is  in  progress.  As  of 
April  1999,  we  finished  a  prototype  implementation  which  can 
translate  all  the  features  of  Javacito  into  a  simple,  interpreted  target 
calculus.  We  are  working  on  connecting  this  simple  target  calcu¬ 
lus  implementation  to  the  full-fledged  FLINT  compiler  [34,  35], 
in  order  to  leverage  its  type-directed  optimizations,  compiler  back 
ends,  and  runtime  support.  The  actual  FLINT  intermediate  lan¬ 
guage  is  surprisingly  close  to  the  target  calculus  presented  here 
(MiniFlint); all we need to add is row polymorphism.